


VOLUME 59, NUMBER 4
WHOLE No. 274, 1945


Psychological Monographs 


JOHN F. DASHIELL, Editor 





An Investigation of a General 
Normality or Control Factor 


In Personality Testing 


By 
PAUL E. MEEHL 


University of Minnesota 


Published by 
THE AMERICAN PSYCHOLOGICAL ASSOCIATION, INC. 




Publications Office 


1515 MASSACHUSETTS AVE., N.W., WASHINGTON 5, D.C. 







ACKNOWLEDGMENTS 


THE WRITER gratefully acknowledges his indebtedness to his adviser,
Dr. Starke R. Hathaway, and to Dr. Howard F. Hunt,
both of whom contributed many interesting and helpful ideas in
conversations regarding the problem investigated.

Special gratitude is due to Alyce I. Meehl for constant encouragement
and for invaluable assistance in the collection and analysis of
the data.










CONTENTS

[entry illegible]
I. Introduction and Problem
II. Theoretical Considerations Regarding MMPI
III. [chapter title and five section titles illegible]
IV. Reliability and Differentiating Power of the N-Scale
    [section title illegible]
    Differentiation of the Criterion Groups
    Distribution of N for Unselected Normals and T-Scores Based Upon This
    Deviation of the “College Level” Group on the N Scale
    Application of N-Scale to a “Test” Group of Normals with Elevated [illegible]
    Application to Cases with Elevations Other Than on the Neurotic Triad
V. Miscellaneous Properties of N and Its Relation to Certain Other Variables
    N Considered as an Inverted “Lie” Scale in Terms of the Scores of a Group of Clinical Abnormals Showing Normal MMPI Profiles
    Results of Application of N to Unselected Abnormals
    The Relationship of N to Age and to the Other Variables of the MMPI
    Item Overlap Between N and Other MMPI Scales
    An Examination of the Ten Most Potent Items on N
    Ten Most Powerfully Discriminating Items of N, with the Response Scored for the More “Normal” Indicated
    Item Overlap of N with the “Normal” Component of the Humm-Wadsworth [illegible]
    A Minor Experiment in “Blind” Diagnosis Utilizing N
VI. [title illegible]
VII. [title illegible]
[final entry illegible]










AN INVESTIGATION OF A GENERAL NORMALITY OR 
CONTROL FACTOR IN PERSONALITY TESTING¹


CHAPTER I


INTRODUCTION AND PROBLEM 


THE PRESENT investigation had its ori-
gin in several converging lines of
thought, beginning with a consideration 
of Rosanoff’s theory of temperament as 
described briefly in his well-known man- 
ual (30), and related specifically to the 
problem presented by persons obtaining 
markedly abnormal scores on the Minne-
sota Multiphasic Personality Inventory²
but managing to stay out of the hands 
of the psychiatrist. Although it is ex-
pected on purely statistical grounds that
about 1 in 40 people in the “general 
population” will receive scores more than 
two standard deviations above the mean 
on any given scale, still it is of interest 
to consider such deviates with regard to 
their psychiatric condition. This is par- 
ticularly true because some of these 
scores are so markedly deviant, i.e., four
or five standard deviations above the 
“normal” mean, that they cannot reason- 
ably be looked upon as merely the tail 
cases of a normal distribution from the 
same supply as the normals whose statis- 
tics define the probabilities in question. 
Furthermore, it is not possible to dismiss 
all of these cases as simply “test misses” 
(not in itself a very descriptive term) 
when closer acquaintance typically shows
them to be characterized by rather ab-
normal amounts of the traits in question
on clinical grounds. For example, we
find a “normal” subject who shows a de-
pression score 3.5 standard deviations
above the mean of unselected “normals,”
who has no language handicap or intelli-
gence deficit which would invalidate his
test results, and who complains of feeling
blue, as if nothing was worth while,
and of being filled with a continual rumi-
native anxiety. Nonetheless, we find
him attending his classes, achieving scho-
lastic success, making adequate social
adjustments, and in general behaving
in such a way that no psychiatrist would
consider him worthy of therapeutic at-
tention unless he had no really ill pa-
tients as competitors. On the other hand,
one can find numerous patients in the
psychopathic unit of the University Hos-
pitals who are diagnosed as psychotic or
reactive depressions and whose depres-
sion has left them effectively incapaci-
tated for ordinary social and economic
interaction to the extent that they are in
an institution under psychiatric care, yet
with depression scores a standard devi-
ation or more below that of this first case
considered. The immediate problem is
“How is such a thing possible?”

¹ This paper is a revision of the thesis sub-
mitted to the Department of Psychology of the
University of Minnesota in partial fulfilment of
the requirements for the degree of Doctor of
Philosophy, March, 1945.

² Henceforth the Minnesota Multiphasic Per-
sonality Inventory will be referred to for brevity
as MMPI.

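The tail frequencies referred to above can be checked numerically. The following is a minimal sketch in Python, assuming that scale scores are exactly normally distributed in the general population (an idealization); the function name upper_tail is introduced here purely for illustration. It computes the proportion of cases expected to fall more than z standard deviations above the mean.

```python
import math

def upper_tail(z):
    """Proportion of a normal distribution lying more than z SDs above the mean."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Deviations mentioned in the text: 2, 3.5, 4, and 5 standard deviations.
for z in (2.0, 3.5, 4.0, 5.0):
    p = upper_tail(z)
    print(f"z = {z:>3}: p = {p:.3g}, about 1 in {round(1.0 / p):,}")
```

Under this assumption a score two standard deviations above the mean is expected in roughly 1 case in 44, in line with the "about 1 in 40" quoted above, whereas scores four or five standard deviations out would be expected only once in tens of thousands or in millions of cases.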
Rosanoff’s theory of temperament, 
which was psychometrically embodied in 
the Humm-Wadsworth Temperament 
Scale, may be briefly summarized at this 
point. He considers three major com- 
ponents tending toward abnormality as 
manifested in full-blown psychiatric con- 
ditions to be present also in lesser degrees 
among “normal” individuals. These com-
ponents he calls the antisocial compo-
nent, characterized by “imposed” moral- 
ity and a tendency to hysteroid and 
psychopathic manifestations; the cyclo- 
thymic component, characterized by 
mood swings, emotional lability, and a 
tendency to manic-depressive reactions; 
and the chaotic sexuality component, 
identifiable roughly with the tendency to 
schizophrenia. Relatively independent of
these three abnormal components he
posits a fourth factor which is the “nor-
mal,” inhibiting, or controlling factor of
temperament. The theory is that if one 
possesses a sufficiently great amount of 
this general normality or control factor 
in his make-up, he will be able to stay 
out of psychiatric difficulties and will 
successfully restrain any deviation that 
may exist in one of the others. He admits 
that the evidence for this theory is at 
present sketchy and inadequate, but still 
suggestive enough to warrant its serious 
consideration and investigation. The 
lines of evidence which he adduces in its 
behalf are ontogenetic, pharmacologic, 
pathological, and data from the phe- 
nomena of senile involution. 

The ontogenetic finding which sug- 
gests the presence of a general inhibiting 
and controlling factor of personality is 
the appearance of antisocial, cyclothymic, 
and chaotic-sexual behavior in children 
which in most cases is “outgrown,” and 
which when it appears in children does 
not afford ground for such pessimism as 
to ultimate outcome as the same be-
havior does when appearing in the “ma-
tured” adult. Rosanoff suggests that this 
is due to a differential rate of maturation 
of the (genetic) factors for the normal 
component in different persons, and that 
the appearance of such “abnormal” 
trends in children who turn out to be 
normal in adult life is due to the greater 
slowness of maturation of this inhibiting,
controlling factor of temperament. Those 
in whom the genetic basis for such a 
factor is basically weaker never grow out 
of the abnormal manifestations and as 
adults have a poor prognosis. 

In the pharmacologic realm, Rosanoff 
stresses as especially significant the wide 
individual differences in reactivity to 
alcohol. In persons who are normally 
controlled and inhibited, the effect of 
paralyzing the higher nervous centers 
through alcohol is to release very differ- 
ent kinds of behavior. In some persons, 
chaotic sexuality makes its appearance, 
e. g., the isolated appearance of homo- 
sexual manifestations in superficially 
“normal” males under the influence of 
alcohol. Others show antisocial, still 
others cyclothymic behavior. The hypoth- 
esis is that in the normal condition 
these components are already present in 
all persons (although in varying degrees), 
and that the direction of the uninhibited 
behavior is a function of the relative 
strength of the ordinarily controlled and 
inhibited tendencies in any of these
three directions.

The argument in the case of data from 
organic pathology and senile involution 
is practically identical. In the early 
stages of diffuse cerebral lesions, before 
the general mental deterioration of ad- 
vanced cerebral pathology has super- 
vened, phenomena of a schizophrenic, 
cyclothymic, and antisocial nature occur. 
It is common knowledge in psychiatry 
that although certain kinds of response 
may be expected more often than others, 
depending upon the nature of the 
pathology, nevertheless there is consider- 
able variation in the behavior syndrome 
which occurs when the central nervous 
system is ravaged by organic disease. For 
example, two persons have been ade-
quately controlled and socially adapted
prior to becoming ill. Both individuals 
become paretic in the late stages of 
syphilis, and one shows euphoria, ex- 
pansiveness, and many of the symptoms 
of an atypical affective psychosis. The 
other may instead become irascible, para- 
noid, and hallucinated. “Here again, 
these observations suggest that there ex- 
ists in the cerebrum an inhibiting and 
controlling apparatus; that the functions 
of this apparatus are among the most 
vulnerable of all the cerebral functions 
and are among the earliest to be lost in 
cases of diffuse cerebral lesions; and that 
the particular manifestations that appear 
in a given case, upon the loss of the 
inhibiting and controlling functions, de- 
pend on the particular temperamental 
components that have already existed in 
the individual, though in a state of com- 
plete or partial latency.” (30, pp. 667- 
668.) 

Although this conception of tempera- 
ment is not widely accepted as such, 
anyone who works with psychiatrists 
hears them frequently classifying patients 
into categories that are at least closely 
similar in their connotation to Rosanoff’s 
control factor. Thus two patients with a 
similar picture of monosymptomatic con- 
version hysteria may be sharply separated 
on the grounds that one has more in- 
tegration, is better controlled, “keeps 
her neurosis in hand” to a greater extent, 
and similar characterizations. 

The problem as originally posed could 
be stated something like this: Is there a 
fairly stable trait (variable, characteristic) 
of persons, with regard to which there 
exist wide individual differences, that is 
relatively uncorrelated with abnormal 
components of personality but which, if 
present, lessens the likelihood that a 
large amount of some abnormal compon-
ent will result in psychiatric breakdown?
If there is, how can it be measured? If it 
can be measured, what correlates does it 
have, what is its behavioral dynamic na- 
ture, and what are the people like who 
have a lot or a little of it? 
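
The question can be made concrete with a toy simulation. The sketch below is only an illustration under assumptions introduced here, not a model proposed in the source: an abnormal component and a control factor are drawn as independent standard normal variables, and "breakdown" is assumed to occur when the former exceeds the latter by more than a fixed threshold. All names, cutting scores, and the threshold value are hypothetical.

```python
import random

def breakdown_rates(n=50_000, deviant_cut=2.0, threshold=2.0, seed=0):
    """Breakdown rate among deviant scorers, split by sign of the control factor.

    Assumed toy model (not from the source): breakdown occurs when
    abnormal - control > threshold, with the two traits independent.
    """
    rng = random.Random(seed)
    counts = {"high_control": [0, 0], "low_control": [0, 0]}  # [breakdowns, cases]
    for _ in range(n):
        abnormal = rng.gauss(0.0, 1.0)
        control = rng.gauss(0.0, 1.0)
        if abnormal <= deviant_cut:
            continue  # keep only persons with a markedly deviant score
        key = "high_control" if control > 0.0 else "low_control"
        counts[key][1] += 1
        if abnormal - control > threshold:
            counts[key][0] += 1
    return {k: b / c for k, (b, c) in counts.items()}

rates = breakdown_rates()
print(rates)
```

In such a model, persons with equally deviant scores differ sharply in outcome according to their standing on the control variable, which is the pattern the problem statement asks about. Whether any real trait behaves in this way is the empirical question.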

The theoretical importance of such a
factor is too obvious to require discus- 
sion. On the other hand, even if its 
psychological nature were not elucidated 
satisfactorily, the practical importance 
of having a means of measuring it would 
be considerable. It would constitute an- 
other technique for reducing the errors 
of prediction and classification using a 
psychometric instrument such as MMPI. 
It is also conceivable that it would have 
prognostic and therapeutic significance, 
for example, in estimating the probabili- 
ties of staying well after discharge follow- 
ing psychotherapy. In the industrial and 
other situations in which inspection of a 
test profile is expected to yield a predic- 
tion of successful adjustment in the 
prospective situation, any score which 
would differentiate those persons who 
have deviant tendencies but can be rea- 
sonably expected to remain “adjusted” 
by some minimal standard from those 
who with scores no more deviant can be 
expected to prove inadequate, would be 
of tremendous advantage. For these 
several reasons both theoretical and prag- 
matic the development of such a scale 
(either of a very general nature or more 
specifically in respect to some particular 
test) would be of considerable value. 

A survey of the literature fails to reveal 
any empirical investigations of real 
relevance and merit, with almost no ex- 
ceptions. A careful study of articles in the 
Psychological Abstracts under the head-
ings abnormality, adjustment, character,
constitution, control, delinquency, frus-
tration, inhibition, neuroticism, normal-
ity, personality, persistence, psychoneu-
rosis, psychopathy, rigidity, stability, 
stress, temperament, volition, and will 
yields practically no material devoted to 
this aspect of personality testing, al- 
though there appear a number of discus- 
sions regarding such a hypothetical fac- 
tor of personality, under various guises 
and names. Of greatest (and in fact the 
only obvious) relevance is the work of 
Humm and Wadsworth, constructing a 
temperament scale with Rosanoff’s theory 
in mind (17, 18, 19, 20, 21, 22). Of slight
relevance one may mention studies of
“persistence” (15), the theory of “frus-
tration tolerance” (31, 32, 33), and Free-
man’s work on a “psychological plimsoll
mark” (8). Freeman’s is the only one of 
the last three which uses a question- 
answer test to measure the hypothetical 
factor involved, the test in this case being 
the “normal” component of the Humm- 
Wadsworth scale. 

This last scale consists of 38 items 
selected empirically by the authors on 
the basis of overall differentiation be-
tween a group of so-called “normal” per-
sons who were defined so by being
company employees adjusting satisfac- 
torily to their work and showing no 
evidence of abnormal trends in their case 
studies, and a heterogeneous group of 
patients committed to state hospitals as 
the “abnormals.” 

In the application of the Humm- 
Wadsworth scale to clinical diagnosis, the 
assumption was essentially based on the 
theory of Rosanoff, in that a subject with 
a deviant score on a given scale, e. g., 
schizoid-autistic, would not be considered 
as probably maladjusted to a serious de- 
gree if his “normal” component were 
also high, since the latter would act as an 
inhibitor or “brake” on the appearance 
of his abnormal tendency. Whether one 
considers this to have worked or not
depends upon the evaluation he places 
on the Humm-Wadsworth, concerning 
which the evidence is somewhat conflict- 
ing (2, 5, 6, 18, 19, 22, 23). At any rate 
the problem of whether the scale called 
“normal” measures the hypothetical nor- 
mality factor of Rosanoff’s theory is not 
solved this easily, since such a set of 
items might serve the purely psycho- 
metric function required by the theory 
but on an entirely different basis. 

The items on the Humm-Wadsworth 
scale are not psychologically homogene- 
ous in any obvious way. However, for 
the vast majority of these items two 
things at least can be said. Firstly, they 
are scored for “normality” when an- 
swered in the direction that a naive 
person would consider more normal and 
adjusted by inspecting the item content, 
though this fact means less than many 
persons might think when one considers 
what is found in purely empirical item 
analyses such as done by Humm and 
Wadsworth or by the authors of the Min- 
nesota Multiphasic Personality Inven- 
tory. Humm and Wadsworth do not give 
the statistically preferred response for the 
general population, so it is impossible 
to say whether the response called “nor- 
mal” is that of the majority of normal 
persons. Nevertheless the method used 
by these authors in item selection was in 
this case the differentiation of their 
heterogeneous group of “abnormals” (de-
fined by being institution inmates) from
a group of persons rated as normal and 
making satisfactory vocational adjust- 
ment (17, p. 167). So that whether the 
item response scored “normal” on the 
Humm-Wadsworth is the response given 
by the majority of normal persons or 
not, at least it is known to be the re- 
sponse given significantly more often by 
normal persons than by abnormal per-
sons considered as a class. Many of the 
items on the Humm-Wadsworth normal 
scale are found, in slightly altered form, 
in the MMPI. Inspection of the scoring 
of these items on MMPI shows that 88 
per cent of the Humm-Wadsworth “nor- 
mal” items which appear in almost 
identical form in MMPI are responded 
to by the majority of the MMPI general 
population sample in the same direction 
that is scored “normal” on the Humm- 
Wadsworth. This fact, combined with 
the item content, suggests rather strongly 
that to achieve a high “normal” score on 
the Humm-Wadsworth it is necessary 
to avoid saying psychiatrically “bad” 
things about yourself of a sort which are 
so general among all varieties of abnor- 
mals that they show a differentiation 
when this latter group is considered with- 
out further breakdown. Examples of the 
H-W “normal” items are such responses 
as the following: 


Have your hardest battles been with your- 
self? (Yes) 

Has more than one person called you 
hotheaded? (No) 

Have you lost out in several undertakings 
by not making up your mind quickly 
enough? (No) 

Have you several times worked under 
people who seemed to have things fixed so 
that they would get the credit for good work, 
and those under them would be blamed for 
mistakes? (No) 

Have you at times felt obliged to ask your 
friends to help you personally even though 
you could not return the favor? (No) 

Do you think most people inwardly dislike 
putting themselves out to help other people? 
(No) 

Is it always best to keep your mouth shut 
when you are in trouble? (No) 

Have you ever taken up a line of work 
which called for close attention to fine de-
tail, to the exclusion of other activities? (No)

Have you ever stayed away from another 
person because you feared doing or saying 
something you might afterwards regret? (No) 


Have you ever felt that difficulties were 
piling up to such an extent that you could 
not overcome them? (No) 

Have you at times been certain that people
were talking about you although you couldn’t
prove it? (No)

Although there are, as always in a test 
of this sort, some responses of a puzzling 
and seemingly irrational character scored 
in a certain way, this list of items (every 
third item in the H-W normal scale) 
indicates, I think, the general tendency 
for the “normal” scored response to con- 
sist of a denial of a trait or symptom 
which if present would indicate malad- 
justment. This observation may seem 
utterly trivial but the importance of it 
will become clear when the interpreta- 
tion of the results of the present study is 
discussed. 

A further relevant finding is that the 
so-called “normal” component shows 
substantial negative correlations with 
several of the other components as found 
by three investigators (2, 6, 23), although 
for some unaccountable reason the co- 
efficients between the normal component 
and other components are reported as 
positive by Humm and Wadsworth. The 
highest negative correlations (ranging 
from —.53 to —.78) reported by these 
other three investigators are with the 
depression and paranoid scales. The cor-
relation of the “no-count,” as would be
expected from the list of items in the 
sample above, is markedly positive with 
“normal.” 

It is not here denied that the H-W 
“normal” component is measuring the 
hypothetical inhibiting, controlling 
“brake” factor of Rosanoff’s theory. But 
it is obvious that the differentiation of 
heterogeneous abnormals from normal, 
adjusted persons by a given item does 
not establish prima facie that it is meas-
uring such a component. The scale, the
internal consistency of which is indicated 
by Humm and Wadsworth’s split-half 
reliability coefficient of .82 (17), might 
well be measuring some other compo- 
nent of personality than the one hypoth- 
esized by Rosanoff and still function 
statistically in the way suggested by the 
authors. To mention only two possibili-
ties, it might be measuring the overall
freedom from reality-distortion that ap- 
pears in verbal behavior, so that a gen- 
eral collection of abnormals would score 
low. Or again, it might measure (as will 
be suggested on more evidence below) 
some “test-taking” trait of such a nature 
that while not itself intimately related 
to a specific psychiatric syndrome, it 
would be related to the tendency of per- 
sons, whether normal or abnormal, to 
answer items in a certain way. In which 
case a high “normal” component oc- 
curring in a person who was also high 
on some pathological component might 
operate to suggest improved chances of 
his being clinically normal in the same 
way the “no-count” does the converse. 
I do not wish to defend any particular 
hypothesis regarding the nature of the 
underlying personality trait sampled by 
the H-W “normal” items, but it must be 
understood for what follows that the dif- 
ferentiation of normals and abnormals 
by a given item is, as it stands, a rather 
ambiguous fact. So far as I know, the 
only empirical finding which tends in- 
dependently to support the Rosanoff 
theory as an interpretation of the psy- 
chological meaning of the H-W “normal” 
component is the study of Freeman (8)
mentioned above. 
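
For readers unfamiliar with the statistic, a split-half reliability of the kind quoted above can be computed as sketched below. This is a generic illustration, assuming an odd-even split of dichotomously scored items and the Spearman-Brown step-up; the source does not say how Humm and Wadsworth split their scale or whether they applied the correction, and the small response matrix is invented purely for demonstration.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r_half):
    """Step up a half-test correlation to an estimate for the full-length test."""
    return 2.0 * r_half / (1.0 + r_half)

def split_half(responses):
    """responses: one list of 0/1 item scores per person (odd-even split assumed)."""
    odd = [sum(person[0::2]) for person in responses]
    even = [sum(person[1::2]) for person in responses]
    r_half = pearson(odd, even)
    return r_half, spearman_brown(r_half)

# Invented responses of six persons to eight items (illustration only).
demo = [
    [1, 1, 1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 1],
]
r_half, r_full = split_half(demo)
print(round(r_half, 3), round(r_full, 3))
```

If the .82 reported is a stepped-up value, the corresponding correlation between the two halves would be about .69, since 2(.69)/(1 + .69) is approximately .82.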

In the present investigation the initial 
attack on the problem was made in terms 
of one particular personality test, the 
Minnesota Multiphasic Personality In- 
ventory, which has been described in de- 
tail elsewhere (9, 10, 11, 12, 13, 14, 24). 
In brief summary, this is an empirically 
constructed set of scales furnishing scores 
on nine personality variables named as 
follows: Hs (hypochondriasis), D (depres- 
sion), Hy (hysteria), Pd (psychopathic 
deviate), Mf (masculinity-femininity), Pa 
(paranoia), Pt (psychasthenia), Sc (schizo- 
phrenia), and Ma (hypomania), together 
with three scales used to determine the 
probable trustworthiness of the other 
nine. These last three are: ? (number of 
items answered “cannot say” and thus ef- 
fectively removed from all scales on 
which they appear), L (for “lie,” meas-
uring the deliberate or unconscious tend-
ency of the sorter to put himself in a 
favorable light) and F (tendency to give 
statistically infrequent replies, usually 
suggesting lack of understanding or hap- 
hazard sorting). Before beginning the 
empirical investigation it was felt desir- 
able to formulate as well as could be 
done a priori the logically exhaustive 
alternative ways in which a person could 
come out with a deviant score on the 
Multiphasic on some scale and yet avoid 
being in a psychiatric hospital nonethe- 
less. This requires a preliminary con- 
sideration of the theory of the trait em- 
ployed in work with this instrument, and 
will be dealt with at some length in the 
next chapter. 








CHAPTER II 


THEORETICAL CONSIDERATIONS REGARDING MMPI


IN TERMS of Allport’s distinction be-
tween a “biosocial” and a “biophysi-
cal” concept of a trait, it can be said
without hesitation that the attitude prev- 
alent among those who work clinically 
with the MMPI is clearly biophysical. 
Although the diagnosis of a psychosis 
must necessarily be made at the time of 
decision in terms of the “social stimulus 
value” of the patient to the psychiatric 
staff, nevertheless this social stimulus 
value is not the intended content of the 
diagnosis. The best testimony to this 
fact is the very frequent occurrence at a 
case conference of statements such as
“He looks to me like a schizophrenic but 
as you say, he may not be.” It is regret- 
tably true in the field of psychiatry that 
precisely what it means to be a schiz- 
ophrenic is by no means clear. For those 
who are compulsively operational this is 
a frustrating state of affairs, but I do not 
see any advantage in the approach which 
says “We do not yet know the internal
state which is a necessary and sufficient 
condition for belonging to the class of 
schizophrenics, which we would better 
define if we could; therefore schizophren- 
ics are all those labeled as such in a 
psychiatric diagnosis.” In some respects 
the situation is very similar to that which 
existed in the case of a disease-entity 
such as “general paralysis of the insane” 
before Noguchi and Moore found the 
spirochete in the brains of paretics. If 
we could now go back and get Wasser- 
mann, gold colloid, etc., tests on all in- 
dividuals diagnosed G.P.I. from 1850 to 
1900, we should find a number who have
nothing in common with the majority 
so classified except certain similarities of
symptomatology. We know that if the 
pathological process is to be the real 
criterion which specifies an entity, these 
persons were misclassified. If the intent 
of the diagnosers of 1850 was to put 
those persons together who belonged to- 
gether because they had the same “dis- 
ease,” on these non-luetic persons they 
are now known to have been wrong. It 
is true that as long as their entity was 
only defined by the symptom syndrome, 
they cannot be said to have been 
“wrong” in a semantic sense; but inas- 
much as the cases were etiologically and 
pathologically heterogeneous, these for- 
mer investigators would now be willing 
to concede that all of the cases should 
not have been classified together, al- 
though at the time they had no way of 
knowing which ones. 

Another analogy can be found in the 
case of units of measurement in electric- 
ity. Some writers may take the coulomb 
as the basic unit, defined in terms of the 
deposition of silver at the cathode of a 
silver salt solution. In terms of this the 
current can be defined as time rate of 
passage of amount, the resistance in 
terms of current for a given EMF, and 
so on. The manifestations of the passage 
of an electric current may be considered 
from the standpoint of such diverse 
operations as the heat generated in a 
resistor, the amount of deflection of a 
compass needle, the rate of deposition of 
silver, the intensity of illumination of a 
bulb, and the like. It is fortunate for the 
physicist that the dependence of these 
different phenomena on the underlying 
“real” process is sufficiently close and 
straight-forward so that he does not have 





en 
= 


eee. 
ee ee 


= SSS 


ee eee 
= pam . 
aS: 


ee 
rh 


6 


CO more ste 
x 7 
+ 


Powe es iy 
eae s 


a Oe 


ee ee a 





8 PAUL E. 


a serious problem of “choosing” one of 
them as his definition of the amount of 
current, in order to be able to give an 
immediate answer to the question “How 
much current” in those cases in which 
the results of these diverse operations 
and observations failed to agree. Suppose 
now that a Maxwell’s demon exists in 
the silver solution and insists on con- 
founding us by arbitrarily stopping and 
starting ions in their passage to the 
cathode. Another demon exerts variable
amounts of force on the needle of a gal- 
vanometer. Still another interferes in 
some way with the measurement of il- 
lumination from our bulb, and so on 
through all the operations defining 
“amount of current.” The result would 
be a failure of these diverse procedures 
to yield values of the “current” that 
would be isomorphic with the underly- 
ing reality (flow of electrons), and hence 
also they would fail to be in good cor- 
respondence with one another. 

Now the point I wish to make is that 
it is still not necessarily the best scien- 
tific procedure for the confused physicist 
to say “Well, there doesn’t seem to be 
any decent covariation in these phenom- 
ena, so I guess there isn’t any under- 
lying common class of events involved. 
Operationally speaking, I have to define 
what I mean by flow of current unam- 
biguously, so I'll pick galvanometer de- 
flection. In the future, whatever else is 
observed, whenever the galvanometer is 
deflected so much, I shall define that as 
the flow of so much current. Any dis- 
crepancies with other facts are just un- 
fortunate, that’s all.” This approach is 
positivistically elegant but in my opin- 
ion not calculated to bring about the 
most fruitful scientific thinking, particu- 
larly in a field which is in an embryonic 
state and in which the best criteria for 


v 





MEEHL 


inclusion of a definition or concept are 
not even yet known except dogmatically. 
Such covariation as does remain after the 
demons have done their work might be 
fruitfully utilized in an effort to get 
tentative interpretations including inter- 
pretations of the failures of correlation, 
and to improve the apparatus so as to 
keep the demons out. If the latter can 
be managed, the “true” underlying flow 
of electrons will finally be arrived at. 
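
The argument may be illustrated by a small simulation, offered as a sketch under assumptions introduced here rather than as part of the original argument. A latent "current" is observed through four fallible indicators, each disturbed by independent random error (the "demons"). The number of indicators, the error variance, and all names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate(n=5000, n_indicators=4, noise_sd=1.0, seed=1):
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Each indicator is the latent value plus its own independent disturbance.
    indicators = [
        [value + rng.gauss(0.0, noise_sd) for value in latent]
        for _ in range(n_indicators)
    ]
    composite = [sum(vals) / n_indicators for vals in zip(*indicators)]
    single_r = [pearson(ind, latent) for ind in indicators]
    between_r = pearson(indicators[0], indicators[1])
    composite_r = pearson(composite, latent)
    return single_r, between_r, composite_r

single_r, between_r, composite_r = simulate()
print([round(r, 2) for r in single_r], round(between_r, 2), round(composite_r, 2))
```

With these settings the indicators agree only moderately with one another, yet each still tracks the latent variable, and their average tracks it more closely than any one alone. This is the sense in which residual covariation can be exploited rather than discarded in favor of a single arbitrary operational definition.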

To take a typical example in the 
clinical use of the Multiphasic, one 
might consider the case of hysteria. It 
has long been known by psychiatrists 
that aside from the presentation of con- 
version symptoms which define the clini- 
cal entity named hysteria, there is a more 
or less well defined personality structure 
characteristic of the persons so diag-
nosed. At the more superficial levels one
speaks of the tendency of the hysteric 
to shut out unpleasant facts from his 
conscious thinking; the “emotional im- 
maturity” showing up by parasitism, 
sexual frigidity, overdependence upon 
parents, selfishness in love relationships; 
a certain beautiful indifference to the 
symptoms; Rosenzweig’s “impunitive- 
ness” as a reaction to frustration; and so 
on. A psychoanalyst goes on to char-
acterize even deeper layers of the per-
sonality: phallic genital level, etc. “Hys-
teria” diagnostically refers to the pro-
duction of certain disease-mimicking 
symptoms, yet a “hysteroid” tempera-
ment is discernible and is in fact utilized
in diagnosis, e.g., as between a diagnosis 
of hysteria and hypochondriasis. 

This clinical finding is supported by 
the empirical findings of test item anal- 
ysis arrived at in studying the responses 
to the MMPI of persons so diagnosed. 
There is in the set of items of the hysteria 
scale a group referring to somatic com- 





A GENERAL NORMALITY OF CONTROL FACTOR IN PERSONALITY TESTING 9 


plaints, the core trait of the clinical 
entity as the medical man perceives it. 
But there is also a set of items reflecting 
a tendency to deny psychiatric and per- 
sonality problems, typified by responses 
such as saying “false” to such items as 
“I frequently have to fight against show- 
ing that I am bashful,” “I get mad easily 
and then get over it soon,” “Often I 
cannot understand why I have been so 
cross and grouchy,” and so on. The fact 
that this set while self-correlating is 
slightly negative in correlation with the 
somatic set does not prevent the sum 
of the two sets from differentiating hys- 
terical persons better than either one 
could do it alone. Although persons in 
general tend to answer these items in a 
negatively correlated way, yet a person 
with a clinical hysteria has a tendency to 
reverse this negative correlation and be 
a deviate in the same direction with re- 
gard to both. It would seem tentatively 
plausible that the “hysterical” person is
characterized by his deviation on several 
unrelated (or in this case actually nega- 
tively related) components of personal- 
ity. How those components had best be 
defined can only be found by an empiri- 
cal investigation of their clinical cor- 
relates, the way they hang together, and 
how to give them the most psychologi- 
cal meaning. 
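
A rough numerical sketch of why the sum of two negatively related item sets can discriminate better than either alone. It rests on simplifying assumptions that are not in the text: both sets are standardized, each separates the hysteria group from normals by the same amount, and the within-group correlation r is the same in both groups. The function name and all numbers are illustrative only.

```python
import math

def separation_of_sum(d1, d2, r):
    """Standardized group separation of the sum of two unit-variance subscales.

    d1, d2: mean difference between the groups on each subscale, in SD units.
    r: within-group correlation between the subscales (assumed equal in both groups).
    """
    variance_of_sum = 2.0 + 2.0 * r  # Var(X1 + X2) = 1 + 1 + 2r
    return (d1 + d2) / math.sqrt(variance_of_sum)

# Hypothetical values: each item set alone separates the groups by 1 SD.
alone = 1.0
negatively_related = separation_of_sum(1.0, 1.0, r=-0.2)
positively_related = separation_of_sum(1.0, 1.0, r=0.6)

print(f"either set alone: {alone:.2f} SD")
print(f"sum, r = -0.2:    {negatively_related:.2f} SD")
print(f"sum, r = +0.6:    {positively_related:.2f} SD")
```

Under these assumptions the negative correlation shrinks the variance of the sum without shrinking the difference between group means, so the combined score separates the groups more sharply than it would if the two sets were positively related.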

For the present purpose it will be con- 
venient to make a clear distinction, 
therefore, between the operationally 
measured score achieved on a scale with 
a certain name on the one hand, and the 
hypothetical amount of the trait pos- 
sessed on the other. In what follows if the 
biophysically intended trait is discussed 
it will be referred to by its name, where-
as the score value will be indicated by
the initial letters of the scale as is com- 
mon in present discussion of the MMPI
at this institution. Thus, in the case of
hysteria we have first, “hysteria,” the 
clinical entity showing conversion symp- 
toms and the like; secondly, we have 
hypothesized the “hysteroid component” 
(or components) which predispose to
hysteria but are not identical with the
appearance of symptoms; and thirdly,
we have “Hy” which is the obtained
Multiphasic score. Allport’s considera-
tion of the “individual” trait as mean-
ing that every person’s hysteroid com-
ponent is unique and cannot truly be ar-
ranged on any such continuum will be
theoretically conceded but can obviously
not be treated of in measurement pro-
cedures of the present nomothetic sort.

The construction of the MMPI has 
been detailed elsewhere and will only be
summarily treated at this time. The pro- 
cedure has been with trivial exceptions 
a thoroughly empirical one which im- 
plicitly denies the usual assumption that 
one can deduce the verbal behavior of a 
certain sort of person in a test-taking 
situation from a priori considerations
alone. Allied with this denial is the at-
titude that the verbal type of personal- 
ity inventory is not most fruitfully seen 
as a “self-rating” or self-description
whose value requires the assumption of 
accuracy on the part of the testee in his 
observations of self (26). Rather is the 
response to a test item taken as an in- 
trinsically interesting segment of verbal 
behavior, knowledge regarding which
may be of more value than any knowl- 
edge of the “factual” material about 
which the item superficially purports to 
inquire. Thus if a hypochondriac says 
that he has “many headaches” the fact 
of interest is that he says this, and that 
fact would not be controverted by any 
investigations establishing that he has 
interpreted the word “many” in a statis-
tically rare fashion and actually does not 
suffer headaches any more frequently 
than the ordinary person. The now 
prevalent pessimism with traditional 
question-answer personality tests because 
they all require the subject to be both 
honest and objective in his self-descrip- 
tions simply does not apply because it is 
what he says of himself that is diagnostic 
rather than what is in fact the case. 

With this orientation the procedure 
of construction of the MMPI follows in 
a straightforward fashion. Consider any 
homogeneous clinical group for the de- 
tection of which a scale is to be con- 
structed. Then the items on such a scale 
are selected by the empirical determina- 
tion of response-frequencies for this 
group as compared with those for per- 
sons in general. It may be the case that 
other groups than the one intended also 
deviate in their responses to certain 
items. This is again determined em- 
pirically and such items are either elimi- 
nated or a new comparison of the group 
desired with the erroneously discrimi- 
nated group is made to create a “correc-
tion scale” (10, 11). By this technique
there have up to the present writing 
been constructed the nine scales men- 
tioned. 

Before embarking on the experimental 
material in the present investigation it 
is of interest to formulate the possibili- 
ties of a “test miss” in so far as they can 
be seen a priori. Granted the two simple 
and well established facts that (1) the 
scales tend to identify the people for 
whom they were named, (2) they do not 
by any means do so infallibly, the ques- 
tion arises “Why does it miss when it 
does?” I have tried to set down what
seems to me an exhaustive list of the 
ways in which this could theoretically 
occur, keeping the above distinction be-
tween a scale value and the biophysical 
trait value in mind. Some of these, es- 
pecially I-B, are not always sources of 
error, as should be clear from the preced- 
ing discussions, but lead to errors of 
certain kinds in certain cases. 


I. “Errors of measurement” as regards the
   single trait score itself. This includes all
   those cases in which the personality vari-
   able underlying the scale responses does
   not actually exist in the amount indi-
   cated by the score.

   A. The answers given do not correspond
      to the facts as the patient himself
      sees them.

      1. Inadequate understanding of the
         questions due to feeblemindedness,
         organic brain disease, reading diffi-
         culties, bilingualism, etc.

      2. Failure to cooperate in the task re-
         quired.

         a. Deliberate misrepresentation—
            “lying.”

         b. Responding more or less at ran-
            dom due to boredom, hurry,
            fear of results.

      3. Unusual interpretation of the
         meaning of the questions of a sort
         other than those interpretations re-
         lated to the personality trait it is
         desired to detect.

   B. The patient does not see the facts as
      they are.

II. The testee actually possesses the under-
    lying trait to the extent indicated by the
    score, but he fails to manifest it symp-
    tomatically and thus appears as a “test
    miss.”

   A. An external situation or a non-per-
      sonality trait exists in his life which is
      especially favorable for adjustment
      and thus he remains free of symptoms.
      Family protection, sheltering, financial
      inheritance, Murray’s “gratuities” such
      as beauty, high intelligence, etc.

   B. Other compensating personality traits
      exist.

      1. Other traits which can act to in-
         hibit the one in question. For in-
         stance, a potential psychopath is
         kept out of trouble by a concomit-
         ant psychasthenia that makes him
         be careful.

      2. The hypothetical “general control”
         factor posited by Rosanoff.


It is obvious that some of the above
sources of error are minimized by tech- 
niques included in the MMPI itself. For 
example, the case of deliberate misrep- 
resentation is to some extent at least 
overcome by the inclusion of the Lie 
scale and its more recent experimental
forms. A marked lack of understanding 
such as that which might occur in an 
almost illiterate or foreign-speaking sub- 
ject will almost inevitably raise the F- 
score and thus make the interpreter sus- 
picious. It should be clearly understood 
that the use of categories such as “un-
usual interpretation” and “not seeing
the facts as they are” does not contradict 
the empiristic verbal-sample attitude 
previously stated. 

Some varieties of unusual interpreta- 
tion, as in the case of frequency of head- 
aches, are correlates of the trait in which 
we happen to be interested, and the un- 
usual interpretation is actually a basis 
for our discrimination. On the other
hand, there may be interpretive differ- 
ences among persons which are corre- 
lated with entirely different behavior 
variables including those which have no 
appreciable personality relevance. For 
instance, if one asks the question “How 
many are several books?” he gets a con-
siderable variation in the connotation
people put upon this quite neutral, non- 
personological quantifier. Aside from the 
psychiatric factors which determine such 
differences in interpretation in MMPI 
items, this other kind of difference will 
also operate and obscure the results. It 
is a superficial answer to say that how a 
person interprets the word “several” is 
a reflection of his personality, since it
may be a reflection of a personality trait
which is minimally or even negatively 
correlated with the one for whose sam- 
pling the particular item was statistically 
selected. 

It might be advanced against the fore- 
going considerations that the nature of 
the statistical methods used to select 
items is such as to systematically exclude 
such factors. Naturally the relative fre- 
quency of occurrence of this kind of 
non-psychiatric determiner of unusual 
interpretations has been made low by 
the setting of high critical scores in the 
neighborhood of two standard devia- 
tions in a “normal” population which 
by definition must include those individ- 
uals who are psychiatrically normal but 
make such “misinterpretations” when 
confronted with the items. Thus it 
would be contended that all of those 
factors which are psychiatrically irrele- 
vant but weight the probabilities of a 
given item response have been taken 
care of in the setting up of norms. The 
statistical cogency of this argument may 
be fully conceded without detracting in 
any way from the value of investigating 
those cases who belong in the “missed”
group no matter where the critical line 
is drawn. The point is that we would 
prefer to reduce still further the prob- 
abilities of a miss in either direction, 
and these combined probabilities are a 
function of the total overlap of groups 
as well as of the arbitrary critical score 
set. Thus, if you want to “catch” more
of the actual cases of schizophrenia, you 
have to avoid missing some of those 
formerly missed. This means a lowering 
of the critical score, and immediately a 
rise in the frequency of normals er- 
roneously called schizophrenic. You have 
to pay for an improvement in one direc- 
tion by a loss in another. But if there
were procedures for isolating among your 
“normals” some of the sub-groups who 
achieve relatively deviant scores without 
being trait-deviant, these sub-groups 
could then be systematically identified 
and the net result would be a reduction 
of errors for any specified critical level 
chosen in practice. 
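
The trade-off between the two kinds of miss can be illustrated with a small sketch. The two score distributions below are hypothetical (normal curves in T-score units chosen for convenience); they are not data from this study, and the function name is invented for the illustration.

```python
from statistics import NormalDist

# Hypothetical score distributions in T-score units; not taken from the monograph.
normals = NormalDist(mu=50, sigma=10)
patients = NormalDist(mu=70, sigma=10)

def error_rates(cutoff):
    """Return (false_positive_rate, miss_rate) when scores >= cutoff are called abnormal."""
    false_positive = 1.0 - normals.cdf(cutoff)  # normals erroneously called abnormal
    miss = patients.cdf(cutoff)                 # patients falling below the cutoff
    return false_positive, miss

for cutoff in (70, 65, 60):
    fp, miss = error_rates(cutoff)
    print(f"cutoff {cutoff}: normals miscalled {fp:.3f}, patients missed {miss:.3f}")
```

Lowering the cutoff reduces the proportion of patients missed only at the price of more normals miscalled, which is the exchange described above. Only a change in the overlap of the two distributions, such as removing an identifiable sub-group of high-scoring normals, improves both rates at a fixed cutoff.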

The entire problem can be stated suc- 
cinctly in terms of the probabilities in- 
volved in sampling from a collective of 
a defined character. The relative fre- 
quency of a “plus” response for any 
specified item is determined for the large 
and heterogeneous collective called “nor- 
mal” defined by their not being in a 
hospital or under the care of a physician. 
If a person belongs to this collective, 
the probability of his saying “Yes” to 
an item indicating abnormality is em- 
pirically known. Combine several such 
items scored in a given way, and the 
probability of a member of the given 
collective saying “Yes” to all of them is
small. Combine a sufficiently large num- 
ber of such and the probability of his 
saying “Yes” to all of them may become 
vanishingly small. For some other group,
e.g., the set of all hypochondriacs so diag- 
nosed, this probability is empirically 
known to be much larger. Now among 
the whole set of “normals” there are 
various sub-sets whose presence requires 
us to raise our critical score in order to 
avoid miscalling them hypochondriacs. 
Some of them are potential and semi- 
hypochondriacs, but not all of them. It 
is in these sub-sets of non-hypochondria- 
cal high-scoring normals that we are in- 
terested. Their high scores are a function 
of some variables we do not at the mo- 
ment measure, and if we could isolate
them among all the normals we would 
find ourselves with a little sub-collective 
for which the relative frequency of the
significant responses would be different 
from normals generally. That they are 
not a very sizeable minority is neces- 
sarily implied by the very definition of 
a “deviant” score, of course; but they are 
the important minority who produce the 
test misses. 
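
A minimal sketch of how a small sub-collective can account for most of the deviant scores among normals. Everything in it is assumed for illustration: the number of items, the cutoff, the size of the sub-collective, the per-item endorsement rates, and above all the independence of item responses, which real inventory items do not satisfy.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(score >= k) when n items are each endorsed independently with probability p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# All numbers below are hypothetical.
n_items, cutoff = 20, 8
share_sub = 0.05                 # small sub-collective within the "normals"
p_typical, p_sub = 0.10, 0.40    # per-item endorsement rates

high_typical = (1 - share_sub) * prob_at_least(cutoff, n_items, p_typical)
high_sub = share_sub * prob_at_least(cutoff, n_items, p_sub)

overall_rate = high_typical + high_sub
from_sub = high_sub / overall_rate

print(f"normals at or above cutoff: {overall_rate:.4f}")
print(f"of those, from the sub-collective: {from_sub:.2f}")
```

With these figures only a few per cent of all normals reach the cutoff, yet nearly all of them come from the sub-collective, so any procedure that identifies its members would remove most of the false positives.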

To reiterate at this point that “The
statistical frequencies take care of all
that” is to ignore the fact that sub-sets
with diverse relative frequencies may be 
combined to produce a total collective 
in which the relative frequency is greater 
than some of the sub-sets and smaller 
than others. Any procedure for the 
identification of the members of such 
sub-sets will constitute an improvement 
in the measurement. It is necessary to 
realize that there is no such thing as 
“the probability” in the empirical sense 
of frequencies. If probability is a rela- 
tive frequency—and in this type of anal- 
ysis it cannot reasonably be interpreted 
in any other way—then “the probability” 
is undefined until a class for which the 
frequencies are to be determined is 
specified. With every additional fact one 
specifies a narrower sub-collective for 
which the relative frequencies continu- 
ally oscillate as the membership is re- 
duced. The “true” probability is not to 
be thought of as unfortunately unde- 
termined for some experimental or tech- 
nical reason but rather as being simply 
undefined until the limits of the class 
are specified. Furthermore, the set of 
such probabilities all stand on the same 
footing and cannot be discriminated
among as to which is “best.” If an in- 
surance actuary wishes to define the 
probability that I will die in the next 
year, he may begin with the relative 
frequency of deaths per year for white 
married males at my age. If we add the 
fact that I am of Norwegian ancestry
the probability will change, say for the 
better. If we add that I have a mitral 
regurgitation it changes again, for the 
worse. If we add that I have no cardiac
enlargement, it changes again, for the 
better. If we add that I make a practice 
of running upstairs, it changes again for 
the worse, and so it goes. It is a mistake 
to assume that the successive values 
reached by such a procedure correct the 
probabilities previously arrived at, since 
the latter are based upon frequencies 
for larger classes and hence do not refer 
to the same quantity. We have a set of 
probabilities each of which is correct 
for the class to which it refers, and the 
isolation of a sub-set for the class for 
which the relative frequency is different 
cannot be said to refute the former
statement in any meaningful sense. 

It may be asked “What probability 
should be chosen in practice?” The an- 
swer to this question is given by Reichen- 
bach (29)—one should choose the most 
restricted class for which the number of 
cases is still large enough to yield stable 
relative frequencies. To quote his ex-
ample for the simplest case,


“Imagine a class A within which an event 
of the type B is to be expected with the 
probability 1/2; if we wager, then, always on 
B, we get 50 per cent successes. Now imagine 
the class A split into two classes, A₁ and A₂;
in A₁, B has a probability of 1/4, in A₂, B
has a probability of 3/4. We shall now lay
different wagers according as the event of
the type B belongs to A₁ or to A₂; in the first
case, we wager always on non-B, in the 
second, on B. We shall then have 75 per cent 
successes.” 


Although this example is taken from the 
probability of attributes, the principles 
apply in all essentials to the case of 
continuous variables.
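
The arithmetic of the quoted example can be checked with a short sketch. It assumes that the two sub-classes are of equal size, which the quotation does not state but which is what makes the overall probability come out at 1/2; the function name is invented for the illustration.

```python
from fractions import Fraction

def success_rate(classes):
    """Long-run success rate when, within each class, we always bet on the likelier outcome.

    classes: list of (share_of_cases, probability_of_B) pairs.
    """
    return sum(share * max(p, 1 - p) for share, p in classes)

half = Fraction(1, 2)
undivided = [(Fraction(1), half)]
# Equal shares are an assumption; they make the overall probability of B equal 1/2.
split = [(half, Fraction(1, 4)), (half, Fraction(3, 4))]

print(success_rate(undivided))  # 1/2
print(success_rate(split))      # 3/4
```

Both figures are correct for the class to which they refer; the gain comes from wagering differently within each sub-class, not from any correction of the original probability.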

With this orientation one can see the 
so-called “validity” scales of the MMPI
in a new light, namely, as procedures for
the detection of persons who, although 
they belong to the “normal” collective,
also belong to sub-collectives for which 
the relative frequencies of certain item 
responses (and hence of raw scores) are 
not the same as they are for the normal 
collective as a whole. If only 4 per cent 
of unselected “normals” achieve a score 
above two standard deviations on a given 
scale, we shall erroneously segregate that 
4 per cent if we were to use the test by 
itself for diagnosis. If among that 4 per 
cent are some who yielded deviant scores 
because they did not understand the 
English language, these persons may be 
detected by another measurement—the 
scale F. If they constitute any appreci- 
able proportion of this 4 per cent, the
use of the F-scale becomes profitable. 
Such a use of F for the isolation of a 
sub-collective with atypical frequencies 
among the normals in no way contra-
dicts the overall figure of 4 per cent
which remains perfectly correct as a 
frequency obtained in the appropriate 
(larger) class. 

It is not unthinkable that the further 
development of clinically useful per- 
sonality inventories of the question-an- 
swer type may hinge in considerable part 
on the practical application of this line 
of thought. If, as now appears probable 
from our evidence, part of the discrimi- 
nating power of such test items arises 
from what is essentially a “projective”
element in the test-making, we shall do 
well to retain “ambiguities” with delib- 
erate intent. But there are probably 
few verbal ambiguities which cannot be 
settled projectively by the testee as a 
function of several different factors. Or- 
dinarily only one of these factors is of 
interest to us. The purpose of “validity 
scales” and “correction scales” is to
partial out, so to speak, the influence of 
the others. Horst has discussed this same 
problem with mathematical rigor under 
the name “suppression variables” and 
“clearing variates,” as follows (italics 
mine): 


“From the point of view of the variables 
involved, the foregoing treatment suggests 
a way of looking at the problem of statistical 
prediction which has not been previously 
emphasized. The conventional way has been 
to regard the prediction problem as con- 
sisting of two sets of variables, viz., the vari- 
ables to be predicted, that is the dependent 
variables; and the variables from which pre- 
dictions are made or the independent vari- 
ables. 

“The foregoing analysis suggests the utility 
of regarding the independent variables as 
consisting of two sets. The one set should 
have appreciable correlations with the de- 
pendent variable. The other set should have 
negligible correlations with the criterion or 
dependent variable but appreciable correla- 
tions with the other independent variables. 
For convenience, we may designate the first 
set of independent variables as the prediction 
variables and the second set the suppression
variables. These latter have been called
“clearing variates” by Mendershausen. 

“From a common sense point of view, it is 
not difficult to explain what is happening 
in this type of prediction system. In most 
practical prediction problems, it is im- 
possible to find useful prediction variables 
which do not have appreciable components 
independent of the criterion variable, so that 
we are usually predicting part of the criterion 
and also other components independent of 
the criterion. What we need in this case is 
another set of variables independent of the 
criterion but correlated with those compo- 
nents of the prediction variables which are 
independent of the criterion. By means of 
the second set we should be able to suppress 
the irrelevant components of the prediction 
variables.” (16, p. 434). 

The present investigation will be seen 
from the evidence presented to consist 
essentially in the empirical isolation of 
a set of items on the MMPI which act 
in the role of suppression variables for 
at least some of the abnormal compo- 
nents, although the behavioral nature of
the components thus suppressed remains 
largely mysterious. 
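
The effect of a suppression variable can be shown numerically with the standard formula for the squared multiple correlation on two predictors. The three correlations below are hypothetical and chosen only to match Horst's description: the suppression variable is uncorrelated with the criterion but correlated with the prediction variable.

```python
def multiple_r_squared(r_y1, r_y2, r_12):
    """Squared multiple correlation of y on two predictors, from the three correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# Hypothetical correlations.
r_y1 = 0.40  # prediction variable with criterion
r_y2 = 0.00  # suppression variable with criterion
r_12 = 0.50  # prediction variable with suppression variable

alone = r_y1**2
with_suppressor = multiple_r_squared(r_y1, r_y2, r_12)

print(f"R^2, prediction variable alone: {alone:.3f}")
print(f"R^2, with suppression variable: {with_suppressor:.3f}")
```

Although the second variable predicts nothing of the criterion by itself, including it raises the variance accounted for, because it removes from the prediction variable a component that is irrelevant to the criterion.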





CHAPTER III 


DERIVATION OF THE N-SCALE 


PLAN OF INVESTIGATION 


IN ACCORDANCE with the usual empiri-
cal procedure of scale derivation
which has been found fruitful in the 
development of the MMPI, the follow- 
ing plan of study was employed in the 
present investigation: 

1. A group of cases was selected by the 
fact that the profiles showed deviations 
that would ordinarily be labeled “ab- 
normal,” in spite of the apparent 
freedom of the subjects from incapacita- 
ting psychiatric involvement. These
cases will be referred to hereinafter by
the phrase “criterion normals.”

2. A second group of cases was selected 
so that their profiles would match as 
closely as was practicable the profiles 
(case by case) of the first group, but this 
second group was drawn from the hos- 
pital population and thus was composed 
entirely of individuals whose psychiatric 
involvement was such as to result in 
their being in a psychopathic unit. This 
group is referred to hereinafter as the 
“criterion abnormals.”

3. An item analysis was carried out
on all 495 items¹ of the MMPI and those
items which showed a statistically signif-
icant differentiation between the two
groups were brought together on one
key to form a scale.

4. The central tendency and variabil-
ity of the resulting scale was obtained
for a random group of “normal” per-
sons, and standard scores calculated for
further easy reference.

5. The differentiating properties of
this scale were studied by systematic ap-
plication to various special groups, such
as the original criterion groups, “test”
normals and abnormals from diverse
sources, college students, WPA workers,
persons of different chronological ages,
and so on.

6. The internal consistency and test-
retest stability of the scale were deter-
mined by the usual procedures.

7. The intercorrelation between the
scale and the other scales of the MMPI
were calculated for both normal and ab-
normal populations.

8. A minor experiment in the selec-
tion of normal from abnormal profiles
with and without the scale in question
was performed.

9. From an analysis of the preceding
findings plus an inspection of the items
on the scale, tentative interpretive pos-
sibilities are suggested and possible ex-
perimentation for this purpose pro-
posed.

¹ Only 495 items were involved because, of the
original 504 items available on old forms, 9 have
been eliminated in the present box.
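
Step 3 of the plan, the item analysis, might be sketched as follows. The text at this point does not say which significance test or level was used, so the Pearson chi-square on a 2x2 table and the 5 per cent level are assumptions, as are the function names and the response counts.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

CRITICAL_5PCT_1DF = 3.841  # chi-square critical value for 1 degree of freedom

def select_items(true_counts_normal, true_counts_abnormal, group_size):
    """Return indices of items whose "True" frequency differs between the two groups."""
    keyed = []
    for i, (n_true, a_true) in enumerate(zip(true_counts_normal, true_counts_abnormal)):
        stat = chi_square_2x2(n_true, group_size - n_true, a_true, group_size - a_true)
        if stat > CRITICAL_5PCT_1DF:
            keyed.append(i)
    return keyed

# Made-up counts of "True" responses to four items in two groups of 42.
normal_true = [30, 20, 10, 25]
abnormal_true = [15, 21, 12, 38]
print(select_items(normal_true, abnormal_true, group_size=42))
```

Items retained in this way would be scored in the direction favored by one group and summed to give the scale score.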


THE “CRITERION NORMALS” 


The first step was the selection of a 
group of cases showing abnormal pro- 
files by usual standards but not under 
psychiatric care. As is the case with all 
work on the MMPI it was necessary to 
accept a very crude and superficial de- 
finition of “normality.” Ideally, one 
would prefer to be able to have intimate 
psychiatric knowledge concerning the 
behavior and history of the persons so 
designated, but this is obviously impos- 
sible for practical reasons if it is desired 
to have a large and representative
enough group to carry out an item 
analysis. Furthermore, it is necessary to 
avoid equating “normal” to “well-ad- 
justed” in some optimal meaning of the
latter, since what is of clinical import- 
ance is in a fundamental sense this very 
crude brute fact of having managed to 
stay out of the hands of a psychiatrist. 
Thirdly, it is desirable in view of the 
previous era of overoptimism regarding 
the power of personality tests to define 
the situation so as to make it as “hard” 
for the test as possible, an end which is 
certainly facilitated by the blanket as- 
sumption that all persons who are not 
under a doctor’s care or in a psycho- 
pathic ward are by that token more 
“normal” than all of those who are. In 
the following discussion “normal” means
that the profile is that of a person who 
at the time of examination was not 
under a doctor’s care. 

The entire collection of “normal” 
cases (N = 691) in the records was can- 
vassed in order to locate all profiles
showing any of the eight components 
(Mf has been ignored throughout the 
present study) with a T-score equal to 
or exceeding 70.² These 691 normal cases
make up the standardization group upon
which the other scales have been de-
veloped, and their composition has been
previously described (9, 10, 11, 12, 13).

Since it was a priori doubtful whether 
the suppression variables for all eight of 
the abnormal components would turn 
out to be the same, the original item 
analysis was confined to those normals 
from this group who showed elevations 
(T ≥ 70) on any of the three scales of
the so-called “neurotic triad” (Hs, D, 
or Hy). In the event that the same sup- 
pression variables might operate in the 
case of other scales, this fact would be 
subsequently elicited; whereas if such
were not the case the confounding of
factors due to pooling several varieties
of deviant-scoring normals might result
in a failure to locate discriminating
items. The first group studied, therefore,
consisted of persons scoring T ≥ 70 on
at least one (and possibly more) of these
three scales, but no systematic effort was
made to avoid cases in which some other
scale was also elevated. The exclusion
of all but “pure” neurotic triad profiles
would have reduced the number avail-
able for study beyond a workable limit.

² All MMPI scales are expressed as T-scores, where $T = 50 + 10\,(x - \bar{X})/SD$.
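
The T-score conversion and the neurotic-triad selection rule described here can be sketched as follows. The norms and raw scores are invented for illustration and are not the MMPI norms; the function names are likewise hypothetical.

```python
def t_score(raw, normal_mean, normal_sd):
    """Convert a raw scale score to a T-score: T = 50 + 10 * (x - mean) / SD."""
    return 50.0 + 10.0 * (raw - normal_mean) / normal_sd

NEUROTIC_TRIAD = ("Hs", "D", "Hy")

def is_triad_deviant(t_scores, cutoff=70.0):
    """True if at least one neurotic-triad scale reaches the cutoff."""
    return any(t_scores[scale] >= cutoff for scale in NEUROTIC_TRIAD)

# Hypothetical norms (mean, SD) and raw scores, for illustration only.
norms = {"Hs": (12.0, 4.0), "D": (18.0, 5.0), "Hy": (19.0, 5.0)}
raw = {"Hs": 14, "D": 29, "Hy": 21}

profile = {scale: t_score(raw[scale], *norms[scale]) for scale in NEUROTIC_TRIAD}
print(profile, is_triad_deviant(profile))
```

A profile is admitted to the preliminary pool when any one of the three scales reaches the cutoff, whatever the other scales show.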

From this preliminary pool of deviant 
scoring normals a smaller group was 
selected by the application of certain 
important restrictions regarding the 
probable validity of the profiles. Be- 
cause it was clear already that the num- 
ber of cases for analysis would be quite 
small, the restrictions regarding prob- 
able validity were made fairly rigorous. 
In addition it was considered that to 
loosen these restrictions might result in 
a mere duplication of the F-scale, a sup- 
pression variable already extant. Ac- 
cordingly the following two validity re- 
strictions were imposed in the selection 
of a smaller set from the group consisting 
of all normals showing any T ≥ 70:

1. In order to make practically certain 
that no case was included in which the 
elevation of profile was due to careless- 
ness or lack of understanding, it was re- 
quired that the F-score of any case ac- 
cepted should not exceed T = 60. It
was recognized that this would exclude 
an unknown proportion of individuals 
whose F was more elevated not because 
of non-cooperation or misunderstanding 
but because of validly “unusual” re- 
sponses reflecting the same personality 
trends as the elevated profile suggested. 
But for the above reasons it was felt 
preferable to lose this group than to in-
clude cases where cooperation or under- 
standing was doubtful or in which the 
subject had become careless toward the 
end of the testing period. 

2. In order to avoid an excessive fre- 
quency of “Cannot say” responses in do- 
ing the item analysis on so few cases, as 
well as to exclude cases of doubtful 
validity indicated by such an excess, no 
case accepted showed a “?” score of 
T > 64. 

No restrictions were placed upon L 
(lie) because the error introduced by in- 
cluding persons who consciously or un- 
consciously put themselves in a favorable 
light would be opposite to the error one 
wished most to avoid—that of including 
cases in which the deviant profile was 
spuriously high due to some factor or 
other. 
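
The two validity restrictions, together with the deliberate absence of any restriction on L, amount to a simple filter on each candidate profile. The sketch below uses invented profiles and a hypothetical function name; only the two limits come from the text.

```python
def passes_validity_restrictions(profile, f_max=60.0, cannot_say_max=64.0):
    """Apply the two restrictions described above to a dict of T-scores.

    The L score is deliberately not checked.
    """
    return profile["F"] <= f_max and profile["?"] <= cannot_say_max

# Invented example profiles (T-scores).
candidates = [
    {"id": "a", "?": 50, "L": 70, "F": 55},  # high L is allowed
    {"id": "b", "?": 50, "L": 50, "F": 66},  # F too high
    {"id": "c", "?": 68, "L": 50, "F": 50},  # too many "Cannot say"
]

accepted = [p["id"] for p in candidates if passes_validity_restrictions(p)]
print(accepted)
```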

The imposition of these restrictions 
reduced the total number of cases for 
study to only 42, of which 17 were males 
and 25 were females. No effort was made 
to make the sample “representative” 
with regard to such variables as age, sex, 
or socio-economic status, since this 
would have further reduced its size. The 
ages ranged from 20 to 45 with a mean 
age of 32.9 years (32.2 for the males and 
33.3 for the females). Except for such 
minimal comments as were written on 
the record sheet attached to each test, no 
further material was available regarding 
these persons. With the possible excep- 
tion of a self rating by the subject as 
to whether he considered himself “nerv- 
ous” or not, this additional information 
was of no value. Since this latter self 
rating is essentially a condensed variant 
of a great mass of such self judgments 
contained in the test itself it also is of 
little value, although it is worth noting 
that 18 of the 42 cases stated that they 
considered themselves to be “nervous.” 


TABLE 1

Mean T-Scores on All Scales for the 42 Criterion Normals

Scale   Both Sexes   Males      Females
        (N = 42)     (N = 17)   (N = 25)
?       52.0         52.3       51.7
L       53.0         51.5       54.0
F       53.0         52.9       53.0
Hs      66.1         66.9       65.5
D       66.8         66.1       67.3
Hy      64.1         63.3       64.6
Pd      56.5         55.2       57.4
Pa      55.4         55.6       55.3
Pt      60.6         60.6       60.6
Sc      56.4         55.4       57.1
Ma      48.5         47.5       49.2




The mean T-scores for these 42 cases 
as well as for the two sexes separately 
are presented in Table 1. 

It will be noted that although every 
one of these cases was selected for show- 
ing either Hs, D, or Hy with T ≥ 70,
the mean profile is within the limit
usually called “normal,” since individ-
uals do not tend to exceed 70 on all three
in most cases. Twelve cases show scores
≥ 80, the other 30 scoring between 70
and 79. Although the profiles were 
selected on the basis of the neurotic 
triad and apparent validity only, one no- 
tices that with the exception of Ma for 
males, all scales show a mean above the 
population average 50, and the general 
profile pattern is quite similar for the 
two sexes. The three validity indicators 
are very comfortably low. 

We have here, then, a group of 42 
persons who were cooperative and under- 
stood their task at the time they took the 
inventory;³ who score two or more stand-
ard deviations above the mean of the
general population on one or more of
the eight abnormal components; and
who were not under the care of a physi-
cian at the time. While it is true that
these are merely the deviates of the “nor-
mal” population on which the T-scores
were derived, the important thing to
keep in mind is that persons whose pro-
files are no more deviant than these can
be found in the hospital with an in-
capacitating psychiatric involvement. It
is in comparison with these latter per-
sons that the present group of “normal
deviates” are of interest. The dichotomy
“normal” and “abnormal” may be arti-
ficially made sharp by the definition in
terms of “in or out of a hospital.” The
dividing line in the case of the meas-
ured (scale) component is even more ar-
bitrary, and although the problem has
been stated in terms of the critical score
70, it is clear that if there were no “ab-
normals” with scores as low as this the
occurrence of a small group of scores
among normals which exceed this criti-
cal line would merely exemplify a statis-
tical truism and the group involved
would have little practical interest for
us. As long as the analysis was ultimately
to be carried through on matched groups
in any event, one could probably have
chosen some other arbitrary value as
critical for selection of profiles, either
higher or lower than 70.

³ From now on the term valid as applied to
profiles will be used in this restricted sense, and
has no connotation of “agreement with other
criteria.”


THE “CRITERION ABNORMALS”


The 42 profiles obtained from the 
criterion normals were then considered 
one at a time with reference to the pro- 
files of the whole available population 
of hospital cases. Each “normal” profile 
was matched as closely as possible for 
all 8 components, as to sex, and as to 
age (always within 10 years). In spite of 
the laborious canvassing of over 400 ab- 
normals which was repeated with each 
of the 42 criterion normals, it was of 
course impossible to achieve an extremely
close match to each criterion
normal when eight variables in addition 
to age and sex were involved. Most at- 
tention was paid to the three scales of 
the neurotic triad, but all eight scales 
were taken into account to some extent. 
Organic cases were excluded. 
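
The matching described above was done by hand, one profile at a time. As a rough illustration only, the same idea can be sketched as a constrained nearest-profile search. The weighted absolute-difference distance, the triad weight of 2.0, and the `Profile` container are assumptions introduced here for illustration and are not taken from the study; the sketch also does not prevent one hospital case from being chosen for more than one normal.

```python
from dataclasses import dataclass

SCALES = ("Hs", "D", "Hy", "Pd", "Pa", "Pt", "Sc", "Ma")
TRIAD = ("Hs", "D", "Hy")  # neurotic triad, given extra weight below


@dataclass
class Profile:
    sex: str
    age: int
    t: dict  # scale name -> T-score


def distance(a, b, triad_weight=2.0):
    """Weighted sum of absolute T-score differences (weights are hypothetical)."""
    total = 0.0
    for scale in SCALES:
        weight = triad_weight if scale in TRIAD else 1.0
        total += weight * abs(a.t[scale] - b.t[scale])
    return total


def best_match(normal, pool, max_age_gap=10):
    """Closest same-sex profile within the age limit, or None if none qualifies."""
    eligible = [p for p in pool
                if p.sex == normal.sex and abs(p.age - normal.age) <= max_age_gap]
    return min(eligible, key=lambda p: distance(normal, p), default=None)
```

A hard constraint on sex and age combined with a soft distance on the eight scales mirrors the description in the text, where exact agreement on all variables was admittedly impossible.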

As regards the validity indicators, it 
was necessary in the case of the criterion 
abnormals to eliminate “lying” or at
least that aspect or type of lying which
is detected already by the L scale. For
this reason no case was included in the
criterion abnormal group if its L-score
exceeded 60, with one exception which
was unavoidable if any reasonable
matching was to be made on the other
variables. The “?” was not allowed to
rise above 56, nor the F above 62. (These 
limits were not chosen arbitrarily to be- 
gin with, but are the upper limits finally 
obtained when matching was completed 
with the importance of these three vari- 
ables being kept in mind during the 
matching process.) 

As to clinical diagnosis, all 17 of the 
male matched abnormals were diagnosed 
psychoneurosis, as were all but three of 
the 25 female matched abnormals. These 
three cases were diagnosed manic-depres- 
sive manic (questionable diagnosis, 
atypical case); psychopathic personality 
with hysteria and anxiety; and paranoid 
condition with recurrent depression. 

The ages of these criterion abnormals 
range from 19 to 47 with a mean age of 
32.3 years (31.9 for men and 32.5 for 
women). 

The mean T-scores for the 42 criterion 
abnormals as well as by sexes separately 
are presented in Table 2, together with 
the corresponding scores for the 42 cri- 
terion normals with which they were 
matched. 

TABLE 2
Mean T-Scores for the 42 Criterion Abnormals, Sexes Separately and Pooled,
Compared with Those of the 42 Criterion Normals with Whom They Were Matched

        42 Criterion   42 Criterion   17 Male       25 Female
Scale   Normals        Abnormals      Abnormals     Abnormals
?       52.0           50.7           50.2          51.1
L       [illegible]    [illegible]    [illegible]   [illegible]
F       53.0           51.8           52.0          51.6
Hs      66.1           62.9           61.4          63.9
D       66.8           68.3           69.9          67.2
Hy      64.1           65.9           63.4          67.5
Pd      56.5           53.7           52.5          54.5
Pa      [illegible]    55.7           [illegible]   55.8
Pt      60.6           55.5           56.1          55.0
Sc      56.4           52.9           53.5          52.5
Ma      48.5           51.3           52.4          50.6

These results are graphically represented in the accompanying profiles
(Fig. 1). It will be noted that the most
marked difference in both sexes occurs 
in the case of Pt (psychasthenia), in 
which the normals are about one-half a 
sigma above the abnormals. This differ- 
ence may be partly the reason for the 
correlation later found between the 
derived N scale and Pt. On the other
hand, it is difficult to see how such a
systematic trend arose as a result of the 
matching process unless there were some 
factor in or associated with the Pt scale 
which contributes to “normality” in the 
presence of elevated scores on the neu- 
rotic triad. 

Of the eight differences between 
means of the criterion groups, three are 
statistically significant at the 5 per cent 
level. Pt is significantly higher for the 
normals (P < .001), Pd is significantly 
higher for the normals (P < .01), and Sc 
is significantly higher for the normals 
(P < .01). It is worth noting, although
inexplicable, that in all three comparisons
the criterion normals have the more
“abnormal” (elevated) score. 

FIG. 1. Mean profiles of the 42 criterion normals and the 42 criterion abnormals
with whom they were matched.

The differences show that the matching
procedure, in spite of the laborious
profile-by-profile technique employed,
was not completely successful in eliminating
systematic differences between the
two groups. The matching was as good
as could be reasonably expected under
these circumstances, and the differences, 
while statistically significant in three in- 
stances at least, are not in the direction 
which would be serious for the present 
investigation. Correlations subsequently
found with other scales, or in random samples
of normals and abnormals, are to some
extent ambiguous to interpret
because of these differences in the original
groups used in item selection.
Nevertheless it must be clear that the 
correlations cannot be summarily dis- 
missed as being due to this matching 
factor, since it is necessary to explain the 
systematic matching difference which 
can hardly have been due to any bias 
on the part of the matcher. A “correla- 
tion” of variables in the supply might 
result in such systematic differences in 
matching equally easily as the converse. 


THE ITEM ANALYSIS 


We now have two groups of valid in- 
dividual profiles, one being from a set of 
persons who exhibit “deviant” MMPI 
profiles in terms of the arbitrary critical 
score 70 but are not in a psychiatric 
ward or under a physician’s care (“criterion
normals”); the other from a set
of persons with valid and no more deviant
profiles than the first group but in
the psychopathic unit because they have
not been able to make an adjustment
outside (“criterion abnormals”). The
next step is to determine whether there 
are any items which show a significant 
differentiation between these groups. If 
any such normality or suppression vari- 
able as has been provisionally hypothe- 
sized should actually exist, it seems a 
priori likely that at least some among the 
495 items of verbal behavior might be 
at least in part a function of this vari- 
able. It is not to be expected that a 
scale composed of items so selected will
be either behaviorally or mathematically
homogeneous. At the present stage
of analysis it would be foolish to insist 
that it should be. More refined statistical 
procedures can be applied subsequently 
if the end of factorial purification is 
deemed worthy, although it is not a self- 
evident proposition that such is to be 
desired in any case. Any “homogeneity” 
of the first scale derived by this method 
consists precisely in the method of its 
derivation, namely, that it tends to dis- 
tinguish persons who get abnormal 
scores and are abnormal from those who 
get equally abnormal scores but are nor- 
mal. 

In scoring the MMPI a “significant” 
response to any given item is defined as 
that response (whether “Yes” or “No”)
which is made by a minority of the
standardization group. These responses
are indicated with a red plus (+) on the
answer sheet and are also commonly referred
to as “plus” responses. In what
follows the terms significant and plus
will be used in this special
sense, whereas the term “abnormal” will
be used to designate the response characteristic
of the criterion abnormals in
the present study. As will be seen later,
the two indications by no means agree
perfectly; if anything, the reverse is true.

After tallying and summing the “sig- 
nificant” responses for each of the 495 
items for both criterion groups, these 
sums were then converted to propor- 
tions, giving the per cent significant res- 
ponses to each item by the criterion 
normals and by the criterion abnormals. 
All items for which the difference in 
proportion of significant responses be- 
tween normals and abnormals reached 
or exceeded .20, which is approximately 
at the 5 per cent level of significance in
samples of 42 cases, were examined further
by more exact tests. In a few cases
items in which the proportions were very 
close to .00 or 1.00 were further investi- 
gated even though the differences were 
less than .20 because of the greater 
statistical significance of differences 
among proportions approximating these 
limits. 
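
The preliminary screening step can be sketched as follows. This is only an illustration of the rule stated above (a difference of at least .20 between the two proportions of significant responses); the function name, the dictionary layout, and the item labels in the usage note are hypothetical, and the special handling of proportions near .00 or 1.00 is not reproduced.

```python
def screen_items(plus_normals, plus_abnormals, n_normals=42, n_abnormals=42,
                 min_diff=0.20):
    """Return item ids whose plus-response proportions differ by at least min_diff.

    plus_normals and plus_abnormals map an item id to the number of
    "significant" (plus) responses tallied in that criterion group.
    """
    kept = []
    for item in plus_normals:
        p_normals = plus_normals[item] / n_normals
        p_abnormals = plus_abnormals[item] / n_abnormals
        if abs(p_normals - p_abnormals) >= min_diff:
            kept.append(item)
    return kept
```

With invented tallies such as `{"X-1": 20, "X-2": 20}` for the normals and `{"X-1": 11, "X-2": 12}` for the abnormals, only the first item passes, since 9/42 is about .21 while 8/42 is about .19.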

The preliminary survey yielded 103 items which showed differences in the
neighborhood of .20 between proportion of plus responses for the two criterion
groups. The significance of the differences was then determined more exactly
by the t-test for significance of difference of two proportions, using the
formula:

$$
t = \frac{p_1 - p_2}{\sqrt{\left(\dfrac{N_1 p_1 + N_2 p_2}{N_1 + N_2}\right)\left(1 - \dfrac{N_1 p_1 + N_2 p_2}{N_1 + N_2}\right)\left(\dfrac{N_1 + N_2}{N_1 N_2}\right)}}
$$

All items for which the value of t from this formula equaled or exceeded 1.96
were considered significantly differentiating.

It is true that this formula, while an improvement over the older formula for
the standard error of a difference of two proportions, still involves
approximations when it is applied to proportions in the neighborhood of .00
and 1.00, for which the sampling distributions are so markedly skewed except
for very large values of N. However, here also the error introduced is in a
harmless direction, inasmuch as for such very deviant values of p the above
formula for testing significance tends to underestimate the real significance
of a difference, so that one is much more likely in such cases to miss a
significant item than erroneously to segregate an actually non-differentiating
one. In the case at hand, we are not especially desirous of locating all the
items which show “statistical” significance when they are answered in the same
direction by practically every member of both populations, for in such
instances the “significance” arises merely from the deviant values of the
proportions and the effective discriminating power of such an item when
included in a scale is hardly worth the effort of scoring it.

An examination of the differences between proportions for those items showing
p near .00 or 1.00 using the normalized C.R. of Zubin’s nomographs (34) shows
that not more than six items would have been included had this more liberal
test been employed.

On applying the criterion of significance to the original set of 103 items, a
smaller group of 78 items was finally selected which met this criterion. Of
these 78 items, 9 show differences 3 times or more their standard error; 62
show differences between 2 and 3 times their standard errors; and the
remaining 7 items show differences between 1.95 and 2.00 times their standard
errors (just reaching the 5 per cent level). Of the 7 items with a C.R.
between 1.95 and 2.00, 4 are significant for one sex but not for the other.
But in these cases the differences are in the same direction for both sexes
and not less than .12 in any case. This set of 78 items, scored as indicated
in the next section, constitutes the scale hereinafter referred to as N.
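
The significance test for a single item can be illustrated with a short sketch. It assumes that the printed formula is the usual pooled-proportion standard error, that is, the pooled proportion times its complement times (N1 + N2)/(N1 N2) under the radical; the function names are introduced here for illustration. The 1.96 criterion is the one stated in the text.

```python
import math


def t_two_proportions(p1, p2, n1, n2):
    """Difference of two proportions divided by its pooled standard error.

    Raises ZeroDivisionError when the pooled proportion is exactly 0 or 1.
    """
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (n1 + n2) / (n1 * n2))
    return (p1 - p2) / se


def is_differentiating(p1, p2, n1=42, n2=42, critical=1.96):
    """True when the absolute ratio reaches the critical value."""
    return abs(t_two_proportions(p1, p2, n1, n2)) >= critical
```

For two groups of 42 with proportions 26/42 and 16/42 the ratio is about 2.18, so such an item would be retained; with 24/42 and 18/42 it is about 1.31 and the item would be dropped. These proportions are invented for the example.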

The initial “N” was originally chosen 
as a reference to “normal,” having in 
mind the theory of Rosanoff previously 
mentioned and the analogous scale called 
N by Humm and Wadsworth. It seems
rather clear from what follows later that
this N is “normality” only in the sense 
of a suppression variable or correction 
scale, and not in the original sense of a 
positive personality variable of inhibitory 
or “controlling” nature. 

The 78 items on the N scale are as fol- 
lows, the letter T or F indicating the di- 
rection in which the criterion normals 
tend to respond as contrasted with the 
criterion abnormals. In other words, 
those persons who show deviant profiles 
on the other MMPI scales but have re- 
mained in the community out of psychi- 
atric hands, have a greater tendency to 
give the indicated response than those of 
similar profile but in the hospital with a 
psychiatric disturbance. 


CONTENT OF ITEMS ON N SCALE 


(These items are listed according to their 
numbers as given in the back of the revised 
manual of directions, 1943. Indicated answer 
is more characteristic of the criterion nor- 
mals.) 

A-15 I am troubled by attacks of nausea 
and vomiting. (F) 


A-20 I have had attacks in which I could 
not control my movements or speech 
but knew what was going on around 
me. (F) 

A-30 Peculiar odors come to me at times.
(T) 

A-32 I can read a long while without tiring 
my eyes. (F) 

A-39 I have no trouble swallowing. (F) 

B-g I am almost never bothered by pains 
over the heart or in my chest. (F) 

B-51 I have been quite independent and 
free from family rule. (F) 

C-7 My relatives are nearly all in sympathy
with me. (F) 

C-21 I hate to have to rush when working. 
(T) 

C-25 I have often lost out on things be- 
cause I could not make up my mind 
soon enough. (T) 

C-33 It takes a lot of argument to convince 
most people of the truth. (T) 

C-34 When I am feeling very happy and
active, someone who is blue or low
will spoil it all. (T)

C-35 I liked school. (F)

C-46 The only interesting part of the newspapers
is the funnies. (T)


D-g Religion gives me no worry. (F) 

D-13 I feel sure there is only one true re- 
ligion. (T) 

D-15 I believe there is a God. (T) 

D-17 I believe in a life hereafter. (T) 

D-18 I believe in the second coming of 
Christ. (T) 

D-20 The only miracles I know of are 
simply tricks that people play on 
one another. (F) 

D-52 I think most people would lie to get 
ahead. (T) 

E-11 At times I have been so entertained
by the cleverness of a crook that I
have hoped he would get by with
it. (F)

E-13 Policemen are usually honest. (F)

E-15 It wouldn’t make me nervous if any
members of my family got into
trouble with the law. (F)

E-24 I prefer to pass by school friends, or
people I know but have not seen for 
a long time, unless they speak to me 
first. (T) 

E-27 At parties I am more likely to sit by 
myself or with just one person than 
to join in with the crowd. (T) 

E-28 I love to go to dances. (F) 

E-49 It does not bother me that I am not 
better looking. (F) 

E-51 I do not like to see women smoke. 
(T) 

E-52 People often disappoint me. (T) 


F-4 I am easily embarrassed. (T) 

F-6 I am not unusually self-conscious. (F)

F-7 What others think of me does not
bother me. (F)

F-8 It makes me uncomfortable to put on
a stunt at a party even when others
are doing the same sort of things.
(T)

F-17 I have sometimes stayed away from
another person because I feared
doing or saying something that I
might regret afterwards. (T)

F-28 I feel unable to tell anyone all about
myself. (T)

[The card numbers of the next 24 items were printed in a separate column and
cannot be paired reliably with the statements. The numbers that survive are
G-1, G-3, G-14, G-17, G-27, G-28, G-29, G-31, G-32, G-33, G-40, G-45, G-49,
H-2, H-18, H-28, H-34.]

People have often misunderstood my
intention when I was trying to put
them right and be helpful. (T)

I easily become impatient with people.
(T)

Criticism or scolding hurts me terribly.
(T)

Often, even though everything is going
fine for me, I feel that I don’t care
about anything. (T)

I very seldom have spells of the blues.
(F)

I have often felt badly over being misunderstood
when trying to keep
someone from making a mistake.
(T)

I feel anxiety about something or
someone almost all the time. (T)

I believe my sins are unpardonable.
()

I tend to be interested in several
different hobbies rather than stick
to one of them for a long time. (T)

I like to keep people guessing what
I’m going to do next. (T)

I am often said to be hot headed. (T)

I am not easily angered. (F)

I get mad easily and then get over it
soon. (T)

At times I feel like smashing things.
(T)

At times I feel like picking a fist fight
with someone. (T)

I have the wanderlust and I am
never happy unless I am roaming
or traveling about. (T)

When I leave home I do not worry
about whether the door is locked
and the windows closed. (F)

In walking I am very careful to step
over sidewalk cracks. (T)

Often I cross the street in order not to
meet someone I see. (T)

When someone says silly or ignorant
things about something I know
about, I try to set him right. (F)

I have often thought that strangers
were looking at me critically. (T)

If given a chance I could do some
things that would be of great benefit
to the world. (T)

I get anxious and upset when I have
to make a short trip away from
home. (T)

Lightning is one of my fears. (T)


H-35 A windstorm terrifies me. (T)

H-36 I am not afraid of fire. (F)

H-41 I have no fear of water. (F)

H-43 I do not worry about catching diseases.
(F)

H-46 I am afraid to be alone in the dark.
(T)


H-47 I am afraid when I look down from 
a high place. (T) 

H-52 I have no fear of going into a room 
by myself where other people have 
gathered and are talking. (F) 

I-5 Horses that don’t pull should be beaten 
or kicked. (T) 

I-10 The future is too uncertain for a per- 
son to make serious plans. (T) 

I-11 It is great to be living in these times 
when so much is going on. (F) 

I-14 I frequently ask people for advice. 
(T) 

I-15 My plans have frequently seemed so
full of difficulties that I have had to 
give them up. (T) 

I-17 I wish I could get over worrying about 
things I have said that may have 
injured other people’s feelings. (T) 

I-18 I often must sleep over a matter before 
I decide what to do. (T) 

I-30 My way of doing things is apt to be 
misunderstood by others. (T) 

I-38 I often think, “I wish I were a child 
again.” (T) 

I-40 I am entirely self-confident. (F) 

I-41 Once in a while I think of things too 
bad to talk about. (T) 


Of the whole group of 78 items, it 
should be noticed that by far the majority 
of them are answered in the “significant” 
or “plus” direction by the criterion nor- 
mals more than by the criterion abnor- 
mals. There are only 14 exceptions to 
this trend, namely, items A-15, A-20, 
C-21, D-15, D-17, D-18, E-11, E-15, E-51, 
G-3, G-14, I-5, I-14, I-41. Furthermore, it
is evident that in the great majority of 
the items the response most characteristic 
of the criterion normals is not only the 
statistically unusual one but is the one 
which on face inspection of the item 
would be considered the “bad” or “unadjustive”
response. Although this kind
of observation is of less value than is 
sometimes believed in tests of this sort, 
nevertheless the very marked tendency 
for the criterion normals to say “bad” 
things about themselves more than the 
matched abnormals is of considerable in- 
terest. 

A scoring key was next constructed in 
which the direction indicated above as 
characterizing the criterion normals more 
than the criterion abnormals was the 
scored response. Thus, if a subject sorts 
card A-15 into the “False” category he 
receives one raw score point for N (“normality”);
while if he sorts card A-30 into
the “True” category he receives one raw 
score point. High raw scores on N, there- 
fore, indicate “normality” whereas low 
raw scores indicate “abnormality,” at 
least as far as persons having deviant pro- 
files are concerned. In what follows, the 
raw N-score of a person, therefore, means 
the number of responses he made to the 
N items which agree with the response 
indicated by the letters T and F in the 
foregoing list—the number of times he 
responded in the direction preferred by 
the criterion normals. 
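
The scoring rule just described amounts to counting agreements with the key. A minimal sketch follows, using only three of the 78 keyed items from the foregoing list; the function name, the dictionary representation, and the choice to count an unanswered item as a non-agreement are assumptions made for illustration.

```python
# Partial key for illustration only: three of the 78 keyed items.
N_KEY = {"A-15": "F", "A-20": "F", "C-21": "T"}


def raw_n_score(responses, key=N_KEY):
    """Count responses that agree with the keyed (criterion-normal) direction.

    responses maps an item id to "T" or "F"; items missing from responses
    are treated here as not agreeing with the key.
    """
    return sum(1 for item, keyed in key.items() if responses.get(item) == keyed)
```

A subject who sorts A-15 as False, A-20 as True, and C-21 as True would receive two raw score points on this partial key.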





CHAPTER IV 


RELIABILITY AND DIFFERENTIATING POWER OF THE N-SCALE 


RELIABILITY 

For 50 normal males whose mean and
SD were practically identical with
those of the sample of “men in general”
on which T scores were subsequently
based, the odd-even reliability of N was
.81, which indicates a fair degree of internal
consistency of the items. For 50
normal females of high school age re- 
tested at intervals varying from 4 to 14 
months, the test-retest reliability was .74, 
which is somewhat lower than the typical 
test-retest reliabilities reported thus far 
for MMPI scales, although these previous 
reliabilities were based upon a different 
sample. 
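
An odd-even reliability of the kind reported above can be computed by correlating each person's score on the odd-numbered items with his score on the even-numbered items. The sketch below assumes items scored 0 or 1 in scale order. Whether the reported coefficient was stepped up by the Spearman-Brown formula is not stated in the text, so that correction is offered only as an optional flag.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def odd_even_reliability(item_scores, spearman_brown=False):
    """item_scores: one list of 0/1 item scores per person, in scale order."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r = pearson_r(odd, even)
    return 2 * r / (1 + r) if spearman_brown else r
```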

In the case of personality measurement 
the desirability of high test-retest re- 
liabilities is dependent in part upon 
whether “function variability” itself is
great or small. If the hypothetical “con- 
trol” factor of Rosanoff were involved, 
we should require any alleged measure 
of it to show a fairly good retest stability. 
Since we do not know the psychological 
nature of what is measured by N, it is 
not possible to say to what extent the co- 
efficient of .74 is lowered by test “unreli- 
ability” and to what extent it would have 
to be lowered to some degree below unity 
by a function fluctuation which if not 
detected by the scale would be indicative 
of the latter’s invalidity. 

In the case of split-half reliability the 
same problem presents itself as was pre- 
viously referred to in connection with the 
dynamic inhomogeneity of the items on 
the Hy scale. Further, “purification” of 
N will have to be undertaken for other 
reasons, but not simply to increase the in- 
ternal consistency which may or may not
be valued, depending upon the practical
situation. 


DIFFERENTIATION OF THE CRITERION 
GROUPS 


When the scoring key for N had been 
constructed the first step was to score the 
criterion cases on the whole key to de- 
termine whether or not the scale was able 
to differentiate them sufficiently well to 
justify applying it to other test groups. 
That the differentiation of the criterion 
groups should be good is only a necessary 
and by no means a sufficient condition for 
accepting such a scale, since the most im- 
portant test lies in its application to 
“test” cases, other than those cases upon
which the item analysis was made. 

The distribution of scores for the cri- 
terion normals and criterion abnormals 
is discussed in the next section. It will 
be noted that the separation, while by 
no means perfect or as clean as could be 
desired, is certainly significant and the 
overlap not excessively great. It is neces- 
sary to avoid undue leniency toward the 
instrument being studied by attempting 
to “explain” all exceptions to the rule 
desired, so that the overlap that exists 
must be taken at its face value in evalu- 
ating the separation. Nevertheless, it 
should at least be noted that it would be 
of considerable interest to have informa- 
tion regarding the adjustment of those 
persons who constitute the low tail end 
of the distribution of criterion normals. 
For this reason it is perhaps not out of 
place to refer to the meager data avail- 
able on the information sheets attached 
to their records, from which we learn 
that the three male criterion normals
having the lowest raw scores on this distribution
(20, 21, and 27) were all unemployed
at time of testing. Of the three
female criterion normals showing raw
scores of 26, one is a college student (the
significance of which will be elaborated
subsequently), another is unmarried at
age 30 and has a deviant church affiliation.
All three of these female normals
consider themselves “nervous” according
to the record sheet.

More precise statistical tests of the significance
of difference between the groups
show that the variances of the males and
females do not differ significantly, nor
do their means, within either the normal
or abnormal group. Nor does the variance
of the normals differ significantly
from that of the abnormals. (Summary
of data in Table 3.)

TABLE 3
Comparisons of Means and Variances by Sex and Psychiatric Status

A. Tests for homogeneity of variance by sex and psychiatric status

Group                     N     d.f.   V          F       P
42 criterion normals
  Male normals            17    16     120.6518   1.206   P > .05
  Female normals          25    24     100.0833
42 criterion abnormals
  Male abnormals          17    16     68.7330    1.437   P > .05
  Female abnormals        25    24     98.8433
Criterion group
  42 normals              42    41     108.5271   1.206   P > .05
  42 abnormals            42    41     89.9891

B. Significance of difference of means of males and females

Group                     Mean      SD       Diff     t_diff   P
Normals
  17 male normals         42.1765   10.656   3.2235   .985     .30 < P < .40
  25 female normals       45.4000   9.802
Abnormals
  17 male abnormals       22.8824   8.043    4.6376   1.627    .10 < P < .20
  25 female abnormals     27.5200   9.741
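
The variance comparisons summarized in Table 3 can be retraced from the printed variance estimates. The sketch below assumes that each F is simply the larger variance estimate divided by the smaller; the function and variable names are introduced here for illustration, and the results agree with the printed ratios only to about two decimals.

```python
def variance_ratio(v1, v2):
    """F taken as the larger variance estimate divided by the smaller."""
    return max(v1, v2) / min(v1, v2)


# Variance estimates (V) as printed in Table 3, part A.
f_normals_by_sex = variance_ratio(120.6518, 100.0833)
f_abnormals_by_sex = variance_ratio(68.7330, 98.8433)
f_by_status = variance_ratio(108.5271, 89.9891)
```

None of the three ratios approaches the value required for significance at the 5 per cent level with these degrees of freedom, which is the basis for pooling the sexes.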

Consequently the sexes were pooled 
and the t-test applied to test the signifi- 
cance of difference between means of 
the normals and abnormals. The mean
score of the 42 normals was 44.10, with
a SD of 10.293; the mean score of the 42
abnormals was 25.64, with a SD of 9.373.
The test of significance between these two
means shows t = 8.48, which, with 80
degrees of freedom, is clearly significant
(P < .001). Data for these various comparisons
are shown in Table 4.

TABLE 4
Comparison of Means on N Scale for the 42 Criterion Normals
versus the 42 Criterion Abnormals

Group                 N    Mean      SD       Mean diff.   C.R.    P
Criterion normals     42   44.0952   10.293   18.4523      8.487   P < .001
Criterion abnormals   42   25.6429   9.373
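
The critical ratio for the two criterion groups can be retraced from the summary figures. The sketch below assumes an ordinary pooled-variance t for independent groups and takes the variance estimates from Table 3; the function name is introduced here for illustration. It treats the groups as independent and so does not represent the matched-pairs analysis.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """t for two independent means using a pooled variance estimate.

    var1 and var2 are unbiased estimates (sum of squares divided by n - 1).
    """
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff


# Means from Table 4, variance estimates from Table 3 (part A).
t_status = pooled_t(44.0952, 108.5271, 42, 25.6429, 89.9891, 42)
```

The same function applied to the female and male normals (means 45.4000 and 42.1765, variance estimates 100.0833 and 120.6518) gives a value close to the .985 printed in Table 3.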

Inasmuch as the difference is taken between
matched groups, a more appropriate
test of significance is the t based upon
differences between matched members of
each pair. When this analysis is made,
we find that the best estimate of the
SE of the difference is 1.5167, which yields a t of 12.42.
Of the 42 pairs involved, in only one case 
did a given normal have a lower N-score 
than the criterion abnormal with whom 
he was matched. This was a female col- 
lege student whose case was mentioned 
above in regard to overlap. 

The median of all 42 normals is 46, 
which is exceeded by one case from the 
abnormals. Considering the sexes sepa- 
rately, no male abnormal reaches or ex- 
ceeds the median of the male normals; 
and one female abnormal reaches or ex- 
ceeds the median of the female normals. 
The inference from these findings was 
that the N-scale had sufficient differen- 
tiating power on the criterion groups to 
warrant further study on other groups. 


DISTRIBUTION OF N FOR UNSELECTED 
NORMALS AND T-SCORES BASED 
UPON THIS 


It was necessary to make a preliminary
standardization in terms of T-scores even
though it was realized that the scale
would not be used in its first form. The
standardization was required for convenient
interpretation when it was found that,
although the sex differences are not significant
in the criterion groups (although,
it will be noted, they are in the same direction
among both normals and abnormals),
there were significant differences between
the means of men and women sampled
at random from the general population
of “normals.”

Inasmuch as the T-scores to be derived 
were merely for convenience in the pres- 
ent study and would not be used in prac- 
tical work, it was sufficient to base the 
standardization on a smaller number of 
cases than usually is employed—in this 
case upon 100 males and 100 females. 
The standard error of the mean given 
by 100 cases is only about 1 point if we 
assume the SD obtained to be reasonably 
accurate at around 10 raw score points. 
Accordingly, 100 cases were selected at 
random from the “normal” files, taking 
30 cases from each age group, 16-25, 26- 
35, 36-45, and the remaining 10 cases 
from the age group 46-55. Cases with 
L ≥ 70, “?” ≥ 60, and F ≥ 60 were excluded
as of doubtful validity; otherwise
the first consecutive 30 cases in each age 
group were taken from files. 

For the 100 normal males, the mean N- 
score was 29.14 with a SD of 9.55; for the 
100 normal females the mean was 35.59 
and the SD 9.80. This difference is in the 
same direction as that found in the cri- 
terion cases, although in this case the dif- 
ference is based upon sufficient cases to 
reach statistical significance. The vari- 
ances do not differ significantly as seen 
in Table 5, but the means are significantly
different.

TABLE 5
Test of Significance of Difference of Means and Variances for
100 Normal Males and 100 Normal Females

Group     N     Mean    V      F        Mean diff.   C.R.    P
Males     100   29.14   9.55   1.026*   6.45         4.712   P < .001
Females   100   35.59   9.80

* The F of 1.026 is not significant at the 5 per cent level, so the variances
are considered not to differ.

It should be pointed out that the foregoing
difference is not only statistically
stable but is of such magnitude as to be
of practical importance and warranting
separate norms; since a difference of six
raw score points is 6/10 of a standard
deviation for both groups. Since high T-scores
on the MMPI indicate abnormality
(in the sense that higher scores increase
the odds that the person is or will be 
psychiatrically ill) this custom has been 
preserved in the present scale, although 
there are arguments on both sides as to 
its advisability. The relation of raw 
scores to T-scores is, therefore, an inverse 
one, contrary to that which obtains in 
all the other MMPI scales. If a person 
showing elevated scores on the person- 
ality components has a low raw score, he 
responds to the N items in the direction 
more characteristic of the criterion ab- 
normal group; and this indicates a great- 
er probability of his being clinically 
abnormal than if he had responded
more frequently in the scored direction 
on N. Hence such a low raw score is 
given a high T score in the table. If a 
profile shows an elevation on any of the 
personality components, the higher the 
T-score is, the greater is our suspicion of 
an actual, behavioral deviation in respect 
to the syndrome indicated by the par- 
ticular scale on which the elevation oc- 
curs. Whether high T-scores on N (i.e.,
low raw scores) indicate a likelihood of 
abnormality in the absence of any signifi- 
cant elevation on the personality com- 
ponents proper remains to be seen; that 
is, an elevated N-component by itself is 
not prima facie suggestive of abnormal- 
ity, as is obvious from the method of 
scale derivation. 
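
The inverse relation between raw scores and T-scores can be made concrete with a short sketch. It assumes a simple linear conversion with mean 50 and SD 10, reflected so that low raw scores give high T-scores; the exact conversion table used in the study is not reproduced here, and the function name is introduced for illustration. The norms are the means and SDs reported for the 100 unselected normal males and females.

```python
def n_t_score(raw, norm_mean, norm_sd):
    """Reflected linear T-score: a low raw N score gives a high T-score."""
    return 50 - 10 * (raw - norm_mean) / norm_sd


# (mean, SD) of raw N for the unselected normal samples.
MALE_NORMS = (29.14, 9.55)
FEMALE_NORMS = (35.59, 9.80)
```

On this assumption a woman with a raw score of 45.40, the mean of the female criterion normals, would fall at a T-score of about 40, and any raw score below the norm mean maps to a T-score above 50.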

This procedure has what may appear 
to be an unfortunate consequence in that 
with a mean of around 30 to 35 and a SD
of 10, the upper limit of possible T scores 
is fixed by the fact that no raw score is 
negative, so that no one can get a T score 
above 80. This fact is comparable to the 
lower limit of possible T scores that ex- 
ists on the MMPI scales, and is related to 
the nature of the items which yields a 
marked positive skewness on these other 
scales. It is possible that with further re- 
finement it will be found advisable to 
construct a set of T-scores based upon 
the distribution of normal persons with 
abnormal profiles, rather than upon the 
general unselected normal population. 
This is especially likely in view of the 
fact that the scale seems to have little 
significance by itself as regards the differ- 
entiation of normals from abnormals, in 
which case the only rational application 
of it would be to the group (both nor- 
mals and abnormals) who score high on 
the personality components. If this were 
done one would not bother scoring N on 
any profiles except those showing a sus- 
picious elevation on one of the eight per- 
sonality components, and the T score 
would specify the position of such a per- 
son on the distribution of all “normals” 
with such deviant profiles. 
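
The reversed scoring and the resulting ceiling can be sketched in a few lines. The sketch assumes a linear T-score (mean 50, SD 10) with the sign reversed, and it borrows the male general-population figures quoted elsewhere in this section (mean raw score 29.14, SD 9.55). The function name and the linear form are illustrative assumptions; the published T table may differ slightly from this approximation.

```python
# Illustrative sketch only: a linear, sign-reversed T-score for the N scale.
MALE_MEAN = 29.14  # mean raw N score, unselected male normals (from the text)
MALE_SD = 9.55     # SD of raw N scores, unselected male normals (from the text)


def reversed_t_score(raw: float, mean: float, sd: float) -> float:
    """Return a T-score in which LOW raw scores give HIGH T-scores."""
    if raw < 0:
        raise ValueError("raw scores cannot be negative")
    return 50.0 + 10.0 * (mean - raw) / sd


# The ceiling arises because the lowest possible raw score is zero.
ceiling = reversed_t_score(0, MALE_MEAN, MALE_SD)
print(f"highest attainable T (males): {ceiling:.1f}")
print(f"T at the general mean: {reversed_t_score(MALE_MEAN, MALE_MEAN, MALE_SD):.1f}")
```

Under these assumptions the ceiling comes out close to 80, in line with the limit stated above.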

In the case of both males and females,
the unselected normals achieve a mean
score about halfway between the means
of the normal and abnormal criterion
groups for their respective sexes.
The relationships are shown in Table 6. 

TABLE 6

Mean T-Scores of Unselected Normals, Criterion
Normals and Abnormals, Both Sexes

Group                          Mean T-Score on N Scale
Male unselected normals        50
Female unselected normals      50
Male criterion normals         37
Male criterion abnormals       56
Female criterion normals       40
Female criterion abnormals     58

In the case of the males, the criterion
abnormals average about 1.9 sigma
“above” their matched normals, in terms
of the variability of the unselected male
population; whereas the female criterion
abnormals average about 1.8 sigma
“above” their matched normals, in terms
of the variability of the unselected female
population.

Combining the sexes after converting
to T-scores, the following table of cumulative
frequency for the entire group of
criterion normals shows their overlap
with the unselected normal population
(Table 7).

TABLE 7

Distribution of T-Scores for 42
Criterion Normals

                                     Cumulative  Cumulative
T-Score        Frequency  Per Cent   Frequency   Per Cent
10 < T ≤ 20         1         2           1           2
20 < T ≤ 30         8        19           9          21
30 < T ≤ 40        20        48          29          69
40 < T ≤ 50         6        14          35          83
50 < T ≤ 60         7        17          42         100





No case among the criterion normals 
obtained a T of over 60 on the N-Scale. 
The median T score for the entire group
is 33.5, sexes pooled, and 83 per cent of
the cases do not exceed the mean of the
unselected normals.

As regards the criterion abnormals,
Table 8 gives the comparable percentages
for them. No case among the criterion
abnormals obtained a T as small as 30
on the N-scale, and only one obtained a
T as small as 40, whereas 69 per cent of
the normals fell at T ≤ 40. The median
T of the entire 42 criterion abnormals
was 59.


TABLE 8

Distribution of T-Scores for 42
Criterion Abnormals

                                     Cumulative  Cumulative
T-Score        Frequency  Per Cent   Frequency   Per Cent
     T ≤ 20         0         0          42         100
20 < T ≤ 30         0         0          42         100
30 < T ≤ 40         1         2          42         100
40 < T ≤ 50         9        21          41          98
50 < T ≤ 60        14        33          32          76
60 < T ≤ 70        16        38          18          43
70 < T ≤ 80         2         5           2           5
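
The cumulative columns of Tables 7 and 8 follow directly from the interval frequencies, and a short sketch makes the bookkeeping explicit. It assumes the frequencies as read from the two tables (the last normal frequency, 7, is the value that brings the total to 42), and it assumes that Table 8 accumulates from the highest interval downward, as its cumulative column indicates. The function and variable names are illustrative only.

```python
from itertools import accumulate

# Frequencies per T-score interval, lowest interval first (from Tables 7 and 8).
NORMALS = [1, 8, 20, 6, 7]            # intervals 10-20 up to 50-60
ABNORMALS = [0, 0, 1, 9, 14, 16, 2]   # intervals "20 or less" up to 70-80


def cumulative_percent(freqs, from_top=False):
    """Cumulative per cent per interval, accumulated upward or downward."""
    total = sum(freqs)
    ordered = freqs[::-1] if from_top else freqs
    running = list(accumulate(ordered))
    if from_top:
        running = running[::-1]  # restore lowest-interval-first order
    return [round(100 * count / total) for count in running]


normals_cum = cumulative_percent(NORMALS)
abnormals_cum = cumulative_percent(ABNORMALS, from_top=True)
print(normals_cum)
print(abnormals_cum)
```

Read side by side, the two lists show the overlap discussed above: 69 per cent of the criterion normals lie at or below T = 40, while all but one of the criterion abnormals lie above it.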





DEVIATION OF THE “COLLEGE LEVEL” 
GROUP ON THE N SCALE 


In all of the work with MMPI it has 
been found necessary to take special pre- 
cautions regarding the college popula- 
tion and college-educated persons. The 
extent to which these persons deviate in 
their responses to many items from the 
general population is sufficiently great 
to necessitate elimination of items which 
on other grounds appear to be good dif- 
ferentiators for the variable being stu- 
died, but also unfortunately “differen- 
tiate” college people from people in 
general. In spite of the care exercised in 
this respect in construction of the per- 
sonality scales of the MMPI, it is still 
observed that college means differ some- 
what from those of persons in general. 
The typical college student, for example,
shows a deviation below the mean on Hs
and D, whereas he tends to show an elevation
on Hy. It was suspected that the
college (and apparently “college-educated”)
group might show considerable
differences from people in general on
the N scale, particularly when the application
of N to some of those cases which
originally aroused interest in the possibility
of such a scale failed to reveal as
high scores on N as would be suggested
by their being clinically normal in spite
of having deviant profiles. Accordingly,
males and females of the college population
were sampled in an effort to determine
the magnitude of this difference.

For a sampling of 50 college males the 
mean raw score on N was 21.40 which 
corresponds to a T score of 59, not quite 
a full standard deviation above the mean 
for the general male population. The SD
of these 50 male college students was
8.938, as compared to 9.55 for the general
unselected male normals. The difference
between these means is significant
(t = 4.746, P < .01) but the difference
between the standard deviations is
not.
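
The significance test for the male means can be approximated with a pooled-variance t. The sketch rests on two assumptions not stated at this point in the text: that the general male sample numbers 100 (the figure given later for the unselected male normals), and that each SD was computed with n rather than n - 1 in the denominator. The function name is illustrative.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t for a difference between two independent means, pooled variance.

    Assumes each SD was computed with n (not n - 1) in the denominator.
    """
    pooled_var = (n_a * sd_a**2 + n_b * sd_b**2) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error


# General male normals (assumed n = 100) versus the 50 college males.
t_value = pooled_t(29.14, 9.55, 100, 21.40, 8.938, 50)
print(f"t = {t_value:.2f}")
```

Under these assumptions the result lands very close to the reported value; a different variance convention would shift it slightly.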

For a similar sample of 100 college females
(from which, however, profiles
showing any T ≥ 70 except Mf had been
excluded for another study) the mean
raw score on N was 25.93, which corresponds
to a T score of about 60, again
about one SD above the mean of the general
female normal population. If the
cases with “abnormal” profiles are included,
the mean rises slightly to 27.16,
corresponding to a T score of 59. The
SD of the entire group of 122 female college
students is 8.079, which in this case
does differ significantly (P < .05) from
that of the general female normal population.

The explanation of this rather pronounced
deviation from the mean of the
general population is not clear. The
fact of this deviation, however, is of
great importance for two reasons. First,
it has a bearing upon the interpretation
of the N scale, of such a nature as to
dovetail with other evidence in casting
serious doubt upon any interpretation
of the scale in terms of Rosanoff’s “normal”
component. Secondly, it indicates
either that the scale cannot be applied
to college students or else that, if so applied,
it must be interpreted using a special
set of norms.

The frequency of college graduates 
among those test and criterion cases 
which seemed to be test “misses” (i.e., 
were out of the hospital in spite of ele- 
vated MMPI profiles but also showed an 
elevated T on the N scale) suggested that 
having been a college student might alter 
the likelihood of a high score even though 
the person was not still in college at time 
of testing. This suspicion was strength- 
ened by the fact that a group of college 
graduates studied with the MMPI in an 
industrial concern showed the usual col- 
lege pattern even though the age range 
was considerable. 

To test this hypothesis there were se- 
lected at random from a sample of pro- 
files obtained in this industrial concern 
the profiles of 50 college graduates (engi- 
neering) of age 25 or over, disregarding 
the profile itself. The mean age of this
group was 31.64 years, as compared with 
30.95 for the 100 unselected male nor- 
mals from the general population with 
whom they are to be compared. The age 
range was from 25 to 46, with a SD of 
5.102 years. We, therefore, have a group 
of males who are all college graduates 
and have been out of college an average 
of 8 years, and whose mean age is very 
close to that of the general male normal 
sample used in getting the tables for T. 

For this sample of 50 male college
graduates the mean N score (raw) was
20.9, which corresponds to a T score of
59, which is exactly that of the sample of
college males still in school at time of
testing. The SD of raw scores was 6.688,
which differs significantly (P < .01) from
the variability of the 100 unselected male
normals from the general population, on
which the T scores were based. The mean
raw score of 20.9 differs significantly
from the mean 29.14 of the unselected
general population males, showing a t
of 3.10 (P < .01). Unfortunately, a corresponding
study for females out of college
for some years at the time of testing
could not be made due to lack of convenient
case material. Considering the
similar elevation of female college students,
it is provisionally taken as probable
that some degree of elevation at
least would be maintained in later life,
although this will be investigated when
a suitable group of profiles is readily
available for study.

It appears from this evidence that the 
elevated T scores of college persons are 
not a function wholly of their age, nor 
of being “in college” at the time of testing.
The question then arises, is it legitimate
and desirable to employ separate
norms in the application of N to the 
college group (henceforth the “college 
group” will be used to designate college 
students and college graduates alike) or 
should the scale not be applied to them 
at all? 

If we had some way of knowing why 
the college group deviated from the gen- 
eral population, we would know how to 
answer this question. But since the rea- 
son for the deviation is not known, it is 
difficult to decide. The proposal has been 
made by certain college counselors that 
a T score of 60 on one of the personality
components “means as much for a college
sophomore as a T of 70 would mean
for a person from the non-college general
population.” The rationale of this attitude
is questioned by those who are
most intimately connected with the application
of the MMPI in clinical diagnosis.
That a T of 60 may mean as much
in terms of its statistical rarity among
college sophomores as one of 70 in the
general population is admitted and is
quite trivial. The important thing to
determine is whether in the long run one 
can expect a college student scoring 60 
on, say, Hs to have as many bodily com- 
plaints and be as refractory to modifica- 
tion as a non-college person with a score 
of 70. It must not be forgotten that scales 
such as those of the MMPI acquire what- 
ever non-statistical meaning they possess 
from the clinical description of those ex- 
treme deviates who make up the diag- 
nostic categories for which the various 
scales are named. “Hypochondriasis” re- 
fers initially to a certain behavior syn- 
drome found in a small number of neu- 
rotic persons; the burden of proof is 
upon anyone who assigns any other than 
a somewhat “watered” meaning of this 
term in describing milder deviates scor- 
ing in the “normal” although elevated 
range. If it is actually the case that col- 
lege students and persons who have grad- 
uated from college exhibit significantly 
less body-consciousness and somatic com- 
plaining than the general population, 
there would seem to be no good reason 
for applying a special set of norms to 
college people based upon their own 
group. 

In the case of the present scale this 
reasoning is not so straightforward. The
function of the N scale is not to measure
a “personality” component which is of 
intrinsic interest to us (although it may 
ultimately turn out to do so), but rather 
to distinguish persons showing deviant 
profiles but psychiatrically adjusted from 
persons showing equally deviant profiles 
who are psychiatrically maladjusted. 
Since we do not know how the scale
works (i.e., the psychodynamics of responding
to items which bring about
such discrimination as empirically occurs)
it is not clear whether separate
norms would facilitate this in the case of 
college students or not. Actually it would 
be necessary to make a separate “test” 
study of college level abnormals and nor- 
mals to see whether college normals with 
deviant profiles score relatively higher 
raw scores on N than abnormals, in 
which case the use of separate college 
norms would be profitable in practice. 
This has not been done, mainly because 
of the great difficulty of selecting cases 
from the abnormal files on the basis of 
level of educational attainment. 

Nevertheless, it is at least suggestive 
to consider the performance of the col- 
lege “normals” who deviate on the other 
personality components of the MMPI 
in terms of their N scores as related to 
college group means and _ variabilities. 
For 22 college girls showing any T score 
of 70 or greater (excluding Mf) but 
“normal” so far as known, the median 
raw score on N is 32, which is at about 
the 86th percentile for a sample of 100 
female college students, and the mean 
raw score is 32.73 which is approximately 
one SD above the female college mean. 
Such a score, on the other hand, would 
be about one-third of a SD below the 
mean raw score of women in general 
(T = 59). 

Among a group of over 100 employed
male college graduates there were found
9 persons who showed a T score of 70
or greater on any of the eight MMPI
personality scales excluding Mf. These
persons, while probably not so well adjusted
as the non-deviant cases, were at
least “normal” enough to make an adequate
occupational adjustment at a fairly
high level and stay out of the hands of
a psychiatrist. Their mean raw score on
N was 28.67, which corresponds to a T
of 50 for the general population. However,
in terms of the distribution of N
scores for the entire group of employed
college graduate men in which they occurred,
this mean is over 1.1 SD above
the latter’s mean, and would be equivalent
to a T score of about 39 if separate
norms were used. This latter T score
is not far from the T of 37 found in the
original criterion group of male normals
showing abnormal profiles.
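
The effect of switching norm groups can be sketched as follows. The sketch assumes a linear, sign-reversed T-score. As a stand-in for the full group of employed graduates, whose exact mean and SD are not given here, it uses the figures reported for the sample of 50 college graduates (mean 20.9, SD 6.688), so the college-norm result is only approximate. The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Norms:
    """Raw-score mean and SD of a reference group on the N scale."""
    mean: float
    sd: float

    def t_score(self, raw: float) -> float:
        # Sign is reversed: high raw N scores map to low T-scores.
        return 50.0 + 10.0 * (self.mean - raw) / self.sd


GENERAL_MALES = Norms(mean=29.14, sd=9.55)
# Stand-in for the employed graduates: the 50-graduate sample figures.
COLLEGE_GRADUATES = Norms(mean=20.9, sd=6.688)

raw_mean = 28.67  # mean raw N score of the deviant-profile graduates
t_general = GENERAL_MALES.t_score(raw_mean)
t_college = COLLEGE_GRADUATES.t_score(raw_mean)
print(f"general norms: T = {t_general:.1f}; college norms: T = {t_college:.1f}")
```

The same raw mean sits at the general-population average under one set of norms and more than one SD on the favorable side of the mean under the other, which is the contrast drawn in the paragraph above.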

Further evidence regarding the desirability
of separate norms for college
persons will have to be collected. Until
this evidence is forthcoming, the same
norms will be used in order not
to favor the scale unduly; however, it
will at least be pointed out in what follows
that in several cases “misses” are
not clearly “misses,” because the cases in
question would not be misses had the T
scores on N been based upon
special college norms.


APPLICATION OF N-SCALE TO A “TEST” 
GROUP OF NORMALS WITH ELEVATED 
NEUROTIC TRIAD SCORES 


As a further study of the differentiat- 
ing power of N it was applied to a new 
“test” group of deviant-profiled normals 
from two sources—a file from a group of 
WPA workers and a set of scores con- 
tributed by a medical detachment at Fort 
Snelling Hospital. While the WPA group 
might be thought atypical to such an extent
as to be unusable for such purposes,
the findings on the other MMPI
scales and the item frequencies for this 
group do not indicate that they differ 
in any appreciable degree from the gen- 
eral population in those aspects of per- 
sonality sampled by the inventory. 

Since the derivation of N was based
upon criterion cases showing elevations
on the neurotic triad Hs, D, and Hy,
the ideal test cases should be those showing
these elevations. Only a small number
of such cases was found in the WPA
and army groups combined, namely, a
sample of 22 male individuals who were
making an adequate adjustment psychiatrically
as far as was known, but who deviated
to the extent of T ≥ 70 on Hs,
D, or Hy. Six cases were included which
probably should have been left out on
the basis of information subsequently
collected; three of these were found to
have been “under a doctor’s care” (complaint
unknown) at the time of testing
(a fact which would exclude them from
the “normal” category according to the
standards on which the MMPI was originally
constructed) and the other three
cases were college graduates.

The distribution of these cases is
shown in Fig. 2. The distribution placed
above it will be discussed later. The
median T score of these 22 “test normals”
with elevated neurotic triad but
out of the hospital was 40.5, and the
mean T score was 42, with a SD of 9.585.
So this test group falls at .8 to 1 SD below
the mean of the general population,
as compared with the criterion normal
males at about 1.3 SD below.

[Figure 2: two dot distributions on a common T-score axis (30 to 70), abnormals above and normals below.]

FIG. 2. Distribution of T-scores for 22 male “test” normals with elevated neurotic triad but out of
the hospital; compared with scores for 27 test abnormals in the hospital with diagnosed psychoneurosis
and elevated triad. Key: X, abnormal (in hospital); O, normal (outside hospital); separately marked
symbols, normals outside who are college graduates or under a doctor’s care.

As may be seen from an inspection of
the distribution, there is a gap from 43
to 50 above which lie seven cases, of
which five belong to the six cases mentioned
above as possibly deserving of exclusion.
The one case with T = 64 is a
college graduate and was under a doctor’s
care when tested. The case at T =
56 and that at T = 50 are both college
graduates. The cases at T = 52 and
T = 55 were both “under a doctor’s
care” at time of testing. Two of these
three college graduates are indicated on
the attached record as “nervous,” and
the one with T = 56 was unmarried at
age 45. With these considerations in mind
it may be said that the “test” group results
compare rather favorably with the
results of the original criterion cases.

Had the six doubtful cases been excluded,
so as to judge the N scale leniently
rather than rigorously, the median
T score would have been 37, at the mean 
of the original male criterion normals. 
The mean T score in that case would 
have been 37.8, again very close to that 
of the criterion normals. 

Another way of looking at these same
data would be to convert the scores of
the three college cases into appropriate
T scores assuming that separate college
norms should be used. In this case the
college cases with T’s of 64, 56, and 50
would instead score T’s of 56, 48, and 41,
respectively. Using these transformed
scores instead of the original ones, the
median T score of the entire 22 test normals
would be at 40. Only 5 cases would
now have a T above the mean T of 50,
and no case would have a T score above
55. (See Table 9.)


TABLE 9

Frequency Distribution for 22 Male “Test”
Normals from WPA and Army Medical Detachment
Group, Showing Elevated Neurotic
Triads. The College Cases Have Been
Expressed in Terms of
College Norms

                          Cumulative             Cumulative
T Score        Frequency  Frequency   Per Cent   Per Cent
20 ≤ T ≤ 29         3          3         14          14
30 ≤ T ≤ 39         7         10         32          46
40 ≤ T ≤ 49         7         17         32          77
50 ≤ T ≤ 59         5         22         23         100
60 ≤ T ≤ 69         0         22          0         100





It seems justifiable from the preceding 
analysis to state that at least so far as 
the neurotic triad is concerned, a “nor- 
mal” person who shows an elevated pro- 
file but manages to stay out of psychi- 
atric hands will average a T score in the 
neighborhood of 40 or about one SD be- 
low (one SD above in raw score terms) 
the mean of the normal population. As 
the T score of a deviant-scoring profile 
rises above 50 the chances of its owner 
being free of psychiatric involvement to 
the extent of hospitalization become pro- 
gressively smaller; whereas a T score as 
high as 60 will make these chances very 
slim. Putting it in another way, in terms
of whatever component the raw scores
on N measure, a person on the average
needs an amount of this component
about 1 sigma above the mean of the
general population if he is to remain
“normal” in spite of elevated scores
(T ≥ 70) on the neurotic triad. As one
reaches and then begins to fall below 
the general population raw score mean 
on this factor, the odds increase that one 
is not a member of the “normal” group. 
If one scores above 70 on Hs, D, or Hy 
and falls as much as one SD below the 
mean of the general population, the 
chances that he is psychiatrically in- 
volved are high. 

It should be pointed out again at this 
time that the great crudity of division 
into “normal” and “abnormal” on the 
basis of the only available objective cri- 
terion “intramural” or “extramural” 
with reference to a psychopathic ward 
inevitably worsens the apparent differ- 
entiating power of a scale, particularly 
one such as the present. Although prac- 
tically all persons who are “in” a mental 
hospital with psychosis or psychoneu- 
rosis deserve to be there, the converse is 
not true for those who are “outside.” Detailed
study of the individuals used in
testing such a scale would almost invariably
work in its favor by finding independent
evidence of psychiatric difficulties
in persons otherwise counted as “exceptions”
to a rule. As an example of
this I might mention one of the few cases
in which a person who would be counted
as such a test “miss” was accessible to me.
I was shown a profile of a college student
with an elevation in the region of T =
90 on the Hy scale, with Hs, D, and Mf
also up but not above 70. I scored his
test on the N scale and found to my discomfiture
(this being before the extent
of college atypicality on N was realized)
that his N score was about 65, contrary
to a provisional hypothesis that 60 was
a rough limit of T scores on N to stay
out of the hospital if one had a deviant
profile on one of the other scales. I
mentally recorded this case, which I had
come across more or less accidentally in
a social way, as a “miss” for the N-scale.
Shortly thereafter I had occasion to meet 
the individual in question personally, 
without his knowing that I had seen his 
MMPI profile. In the course of conver- 
sation I inquired as to his draft status. 
He informed me that he had been in the 
army but had been discharged for “func- 
tional asthma.” I asked whether he had 
any other ailments, and he went on to 
describe the terrible headaches he had so 
frequently for which the doctors could 
find no explanation, the backaches which 
were equally troublesome, and the fact 
that he had had insomnia and was quite 
a “nervous person.” Had this opportunity
for direct clinical contact not presented 
itself, this case would have appeared 
merely as a misplaced X on a frequency 
distribution and called a “miss” for the 
scale. 

In the evaluation of the results apply- 
ing N to the “test” group of normal 
deviates on the neurotic triad, one needs 
to consider the fact that in the original 
criterion group of abnormals, the pro- 
files were in a sense “artificially” held 
down by the nature of the matching 
technique involved. That is, due to 
the rarity of extremely elevated T 
scores among normals, the matched 
profiles of the criterion abnormals were
simply not allowed in the matching 
process to attain the great elevations 
(T = 90 and up) which do occur in the
hospital neurotic population. In a cer- 
tain sense this is of no consequence, 
inasmuch as if it is established that T 
scores above this upper limit almost 
literally never occur in a valid normal 
profile the problem of distinguishing 
“normals” with such elevations does not 
exist. That is, that a scale such as N 


should “work” only within the semipathological
range, say, of T = 70 to
T = 90 (which includes two full standard
deviations) would be no cause for
complaint if everyone scoring above 90
were practically certain to be pathologi- 
cal. It might be objected that the normal 
curve integral does not decrease very 
rapidly between a score at two sigma and 
a score at four sigma. This statement has 
a superficial plausibility, but it is based 
upon the erroneous assumption that the 
“probabilities” in question are those of 
a normal curve of distribution for the 
“normal” population. The point must 
be clear that the “probability of a man’s 
being correctly called abnormal from 
his profile alone” is not a function of the 
normal distribution alone but depends 
to an equal degree upon the distribution 
of scores for abnormal persons and de- 
pends most of all upon the actual rela- 
tive frequency of normals and abnormals 
as defined by the non-test criterion. The 
solution of this problem, which is really 
one of inverse probability, requires the 
application of Bayes’ Rule and as usual 
the relevant knowledge for this applica- 
tion is inaccessible to us in practice. In 
point of fact, if one calls a case “abnor- 
mal” (by some arbitrary non-test dividing 
criterion) whenever one finds a T above 
the 5 per cent level, he will not be right 
95 per cent of the time by any means. 
The relative frequency of clinically nor- 
mal and abnormal persons is such that 
the best guess is probably “normal” with 
any T score between 70 and 80. There 
may only be 5 persons in 100 who score 
above 70 on Sc among the normal popu- 
lation; but there are much fewer than 
5 schizophrenics in 100 among the gen- 
eral population. 

To illustrate this by a specific example,
consider a population in which the relative
frequency of actual schizophrenics
(defined, say, by the presentation upon
careful psychiatric examination of symptoms
so marked that the examiner would
be willing to attach this as a diagnosis)
is as high as 5 per cent, which is, of
course, much higher than really exists.
Assume that it is as rare for an actual
schizophrenic to hold his T score on
Sc down to 70 as it is for a “normal”
person to achieve a score as high as 70
or above, say, again 5 per cent. Then if
one were to draw a sample of persons at
random from such a population and
label every case a schizophrenic who
showed a score ≥ 70, he would not be
right by any means in 95 cases out of
100 but in about 67 out of 100. The
“probability that a man scoring as high
as 70 is a schizophrenic” is only .67, and
1 in 3 times such a diagnosis will be
erroneously applied to a man who is in
fact “normal.”
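
The inverse-probability argument can be made concrete with a small sketch of Bayes' Rule. The inputs are illustrative assumptions rather than a restatement of the figures above: 95 per cent of abnormals and 5 per cent of normals are taken to score at or above the cutting point, and only the base rate is varied. The function and parameter names are hypothetical.

```python
def prob_abnormal_given_high_score(base_rate, hit_rate, false_alarm_rate):
    """Bayes' Rule: P(abnormal | score at or above the cutting point)."""
    true_positives = base_rate * hit_rate
    false_positives = (1.0 - base_rate) * false_alarm_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs: 95 per cent of abnormals and 5 per cent of normals
# score at or above the cutting point; only the base rate is varied.
for base_rate in (0.01, 0.05, 0.10):
    p = prob_abnormal_given_high_score(base_rate, 0.95, 0.05)
    print(f"base rate {base_rate:.2f}: P(abnormal | high score) = {p:.2f}")
```

In every case shown the result falls far short of .95, and it drops sharply as the base rate falls. How closely any one line matches the figure quoted above depends on how the stated assumptions are read.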

For this reason it is quite reasonable 
to concern oneself with the group of normals
scoring between 70 and, say, 90,
as of sufficient frequency to justify study, 
even though the mathematical properties 
of the normal curve (which do not apply 
to MMPI scales for the most part any- 
way) are such as to indicate to a super- 
ficial examiner the practical uselessness 
of studying the group in that range. Out 
of 1000 “normal” cases in the files one 
may expect to find, say, 50 profiles show- 
ing scores above 70 on any given scale, 
and yet not find a single case (except in- 
valid profiles with high F) scoring above 
90. As was pointed out earlier in the
introductory remarks, it is this group of 
50 persons at whom the present study is 
aimed. 

Nevertheless, the mean profile of the
criterion abnormals is almost completely
within the range of two standard deviations
usually called “normal,” and, therefore,
it is worth while to study the N
scores of a group of “test” abnormals of
appropriate neurotic triad diagnosis and
test scores whose profiles are allowed to
vary freely into the higher ranges. The
distribution of these abnormals must
be compared with that for the 22 test
normals we have just finished considering
in terms of their relation to the general
population norms alone.

From the abnormal file were selected
at random 27 male abnormals with an
elevated neurotic triad (Hs, D, or Hy ≥
70) and a clinical diagnosis in harmony
with such a configuration in that the
diagnosis was required to be psychoneurosis,
either hypochondriasis, hysteria,
reactive depression, or mixed type.
The group thus selected turned out to
consist of 11 hypochondriacs, 8 hysterics,
6 psychoneurosis mixed, and 2 reactive
depressions. Cases with ? or F scores over
70 were not allowed. This group of “test”
abnormals is also stringent in its requirements
for the present scale, because
whereas deviations in the upward direction
were unrestricted in this group, all
cases used as test abnormals showed elevations
on the neurotic triad, whereas
a number of hospitalized abnormals with
such a diagnosis fail to do so.

The mean raw score on N for this 
group of abnormals was 26.44, which 
corresponds to a T score of 53. The SD 
of the raw scores was 7.057. The mean of 
53 is about 1/3 sigma below the mean T 
score of 56 found in the male criterion 
abnormals, whose profiles were “held 
down” artificially by the matching tech- 
nique; but the difference is not signifi- 
cant (t = 1.507, p > .10). 

The distribution for these 27 random
and unrestrained profiles was presented
in Fig. 2 for comparison with the distribution
of N scores for the 22 male “test”
normals. It can be seen that the overlap
is considerable, 6 of the test normals
reaching or exceeding the median of the
abnormals. The separation of this test
series should be compared with the separation
shown by the distributions for the
criterion normals. It should be pointed
out again that the highest scoring seven
test normals include three college graduates,
one of whom was under a doctor’s
care, and two other persons under a doctor’s
care for unknown ailments. With
this in mind the overlap does not appear
quite so great.

In spite of these handicaps, the difference
between the means of these two distributions
is 10.37 raw score points, which
is statistically significant (t = 4.268, P <
.01). It can be said with confidence that
even if the matching process did hold 
down the T scores of the criterion ab- 
normals to a slight degree still it did 
not hold them down to such an extent 
that the differentiation is removed when 
the profiles are not so restrained. 

A comparable test study with females 
could not be carried out because of the 
very small number of female normals 
with elevated neurotic triads available 
other than those already used in the cri- 
terion study. There were found only 
seven female cases in the WPA sample 
with elevated triads, of whom one was 
a Negro college graduate (social worker) 
and another was under a doctor's care, 
considered “nervous,” and separated
from her husband at the time of testing. 
This left only five cases for whom both 
clinical normality and applicability of 
norms could be reasonably assumed. The 
mean T score of all seven cases is at 50 
—the mean of the general population. 
If the college case is changed to college 
group T-score, the mean would be 48. 


Whether female test cases would show a 
differentiation better or worse than the 
males cannot be predicted upon present 
evidence. 


APPLICATION TO CASES WITH ELEVA- 
TIONS OTHER THAN ON THE 
NEUROTIC TRIAD 


Due to the small number of deviant 
profiles available for study in the normal 
file, it was not possible to apply the N- 
scale to several new “test” groups of typi- 
cal normals with deviant profiles on 
the neurotic triad, as would have been 
the best procedure. However, it will be 
remembered that in the original selec- 
tion of deviant scoring normals, study 
was confined to those showing deviations 
on the neurotic triad, Hs, D, and Hy. 
The remaining set of deviant scoring nor- 
mals constitute a possible “test” group, 
although they have the disadvantage of 
testing two hypotheses at once. For if the 
N-scale fails to discriminate them in the 
desired direction, it would be unclear 
whether this was because too many of the 
original items were merely selected due 
to sampling errors, or whether the same 
“suppression variables” are simply not ap- 
plicable to the other scales as to the neu- 
rotic triad. 

In order to get a sufficient number of 
cases it was necessary to relax somewhat 
the previously stringent restrictions as 
to validity. In this group, F scores were 
allowed to rise above 60 before discard- 
ing a profile as invalid, although no case 
retained had an F over 66. There was 
also an examination of the profile as to 
its “form,’’ which is, for persons ac- 
quainted with the clinical application 
of the MMPI, a good indicator of 
whether the F is “validly” high or an in- 
dicator of confusion or non-cooperation. 
(Of course, these procedures were applied
before any determination of the N
score was made on the cases.) The result 
of this screening was the selection of 21 
“normal” males showing at least one T 
score equal to or greater than 70, but 
not on any of the three neurotic triad 
scales, and 21 “normal” females with 
similarly deviant profiles. (The total 
number of “test” cases is 42 by pure coin- 
cidence, and it must be clearly under- 
stood that none of these 42 test cases is 
among the 42 of the criterion group used
in item selection.) Due to the relaxing of
the F restriction, three cases with elevated
neurotic triad scores were included
in this group who had previously been
excluded from the criterion normal sample.
The other 39 test cases deviate on
non-triad components only.

The mean raw score on N of the 21
male test cases is 38.09, SD 7.993; for the
21 females the mean raw score is 41.10
and the SD is 6.89. These means correspond
to T scores of 41 and 44, respectively.
So we see that in both sexes, the
mean T score of the test cases rises about
4 points from that of the criterion normals
on whom the item selection was
based. The median T score is 41 for
males, and 44 for females; 42.5 for the
sexes pooled. The shift seems to be from
the region below T = 40 into the region
between 40 and 50. The proportion of
cases which do not exceed T = 50 is 88
per cent, which compares very favorably
with a corresponding 83 per cent for the
criterion group. Here also, no case was
found with a T greater than 60 on the
N-scale. The summary of these cases is
presented in Table 10.

TABLE 10
Distribution of T-Scores on N-Scale for 21 Male and 21 Female “Test”
Normals Showing Elevated Profiles Other Than on Hs, D, or Hy

                        Frequency              Cumulative Per Cent
T Score on N      Males  Females  Both       Males  Females  Both

T ≤ 20              0       0       0           0       0       0
20 < T ≤ 30         3       0       3          14       0       7
30 < T ≤ 40         6       8      14          43      38      40
40 < T ≤ 50        11       9      20          95      81      88
50 < T ≤ 60         1       4       5         100     100     100
60 < T ≤ 70         0       0       0         100     100     100

In spite of the fact that the central
tendencies of the T scores on N rise
when the scale is applied to the new
group, the relative rarity of scores above
the general population mean of 50 is
maintained. This would suggest a fair
applicability of the scale to cases other 
than the triad deviates, were it not for 
the fact that the raw score on the N scale 
correlates to a rather marked degree 
with Pt and Sc and to a lesser but still 
positive extent with some other scales in 
the normal as well as abnormal popula- 
tion. For this reason the mere presence 
of deviation on these other components 
tends to be associated with a deviation 
on N (appearing as a low T score) regard- 
less of whether the person measured is in 
the hospital or out. Therefore, the tend- 
ency of the present group to score low on 
N does not establish its utility as applied 
to cases other than those elevated on 
the neurotic triad, until we investigate 
the magnitude of a similar lowering ef- 
fect on the cases who are in the hospital. 
When this is done results are not nearly 
so encouraging, and in fact indicate that 
while stable statistical differences exist,
the overlap is too great to make the scale
of any appreciable practical value in such 
cases. 
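
The 88 per cent figure quoted above for the "test" normals follows directly from the frequencies of Table 10. The following is a minimal sketch of that arithmetic; the list and function names are illustrative only and are not part of the original study.

```python
from itertools import accumulate

# Frequencies from Table 10, one entry per T-score interval on N:
# T <= 20, 20 < T <= 30, 30 < T <= 40, 40 < T <= 50, 50 < T <= 60, 60 < T <= 70
MALES = [0, 3, 6, 11, 1, 0]
FEMALES = [0, 0, 8, 9, 4, 0]
BOTH = [m + f for m, f in zip(MALES, FEMALES)]


def cumulative_percent(freqs):
    """Cumulative per cent of cases at or below the top of each interval."""
    total = sum(freqs)
    return [round(100 * running / total) for running in accumulate(freqs)]


if __name__ == "__main__":
    for label, freqs in (("Males", MALES), ("Females", FEMALES), ("Both", BOTH)):
        print(label, cumulative_percent(freqs))
```

The fourth entry of the pooled list is the proportion of cases not exceeding T = 50.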

A group of profiles for 25 males and 25 
females was extracted from the file of ab- 
normals, the requirement being that the 
profile must deviate on some other varia- 
ble besides the neurotic triad, that the 
peak scores must be on some other varia- 
ble, and that the clinical diagnosis must 
not be hysteria, hypochondriasis, or de- 
pression. Profiles with ? or F scores above 
70 were also excluded as probably invalid.
These cases were of various diagnostic
groups, including psychoneuroses
other than those mentioned above, 
manic-depression, schizophrenia, psycho- 
pathic personality, paranoid condition, 
behavior problem, and simple adult mal- 
adjustment. The sex difference which 
appears in the goodness of separation is 
such as to warrant separate treatment. 

For the 25 male abnormals the mean
raw score was 32.60, corresponding to a
T score of 46, and the median raw score
was 31 (T = 48). For the 25 female abnormals
the mean raw score was 29.92,
corresponding to a T score of 56, and the
median raw score was 31 (T = 55). These
comparisons are presented in Table 11.
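
The raw-to-T conversions quoted in this section are consistent with an inverted linear T score, in which a high raw score on N yields a low T. The sketch below is an inference from the reported figures, not a formula stated in the text. It assumes the normative means and SDs given for the 100 unselected normals of each sex (male 29.14 and 9.550, female 35.59 and 9.804), and the function name is hypothetical.

```python
# Assumed normative values: mean and SD of raw N for the unselected normals.
NORMS = {
    "male": (29.14, 9.550),
    "female": (35.59, 9.804),
}


def n_raw_to_t(raw, sex):
    """Convert a raw N score to a T score (inverted: high raw gives low T)."""
    mean, sd = NORMS[sex]
    return 50 + 10 * (mean - raw) / sd


if __name__ == "__main__":
    cases = [
        ("male abnormals, mean raw", "male", 32.60),
        ("male abnormals, median raw", "male", 31),
        ("female abnormals, mean raw", "female", 29.92),
        ("female abnormals, median raw", "female", 31),
    ]
    for label, sex, raw in cases:
        print(f"{label}: T = {n_raw_to_t(raw, sex):.1f}")
```

Rounded to whole numbers, these four conversions agree with the T scores of 46, 48, 56, and 55 reported above.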

In spite of the significance of these dif- 
ferences from a statistical point of view, 
the overlap is excessively great and the 


differentiation, while indicating some- 
thing of theoretical interest regarding 
the generality of at least part of the N 
scale as a suppressor, is not sufficient to 
be of practical utility. Inspection of the 
plotted distributions (not presented here) 
indicates that the difference of the sexes 
lies in the much greater tendency of the 
male abnormals to achieve low T (high 
raw) scores than is the case with the fe- 
male abnormals. As regards the normals, 
in both sexes the “test” normals tend to
remain below the population average; 
and this tendency appears also when they 
are compared with the central tendency 
of the 50 abnormals. In the latter com- 
parison, we find only 14 per cent of the 
female normals reach or exceed the me- 
dian T score of the abnormals; and 19 
per cent of the male normals reach or 
exceed the median T score of the abnor- 
mals. But at the other end of the distri- 
bution whereas only 8 per cent of the 
female abnormals fall as low as the me- 
dian of the female normals, we find that 
fully 1/3 of the male abnormals fall as 
low as the median of the male normals. 
In other words, if one has a T score 
above 50 on the N scale with any nontriad 
component deviating to a T of 70 or 
more, he is likely to be abnormal. If one 
has a T above 60 on N in such a case, he 
is practically certain to be abnormal. But
a low T score, at least for males, by no
means guarantees that one will not be abnormal,
whereas it does seem to afford
some such assurance in the case of females.

TABLE 11
Comparisons of the 42 “Test” Normals with Non-Triad Elevations and 50
Random Abnormals with Non-Triad Elevations

                        Median    Mean              Mean Raw
Group                   Raw       Raw      T        Score       t        P
                        Score     Score    Score    Diff.

Males
  21 “test” normals      38       38.10     41
                                                      5.50     2.135    .05 > P > .02
  25 abnormals           31       32.60     46
Females
  21 “test” normals      41       41.10     44
                                                     11.18     5.110    P < .01
  25 abnormals           31       29.92     56

The net result of the investigation of 
this test group can be summed up as fol- 
lows: The mean T-scores on N of “nor- 
mals” showing elevations on other scales 
than the neurotic triad is somewhat 
higher than that of the criterion group 
of normals. The appearance of T scores 
above 50 is equally rare in the test group, 
as is also the case for scores above 60. The 
incidence of scores in the 40 to 50 range 
increases over that of the criterion nor- 
mals. The differentiation of normals 
from abnormals is not nearly so good as 
this finding might suggest, because ab- 
normals with deviations and diagnoses 
other than on the neurotic triad obtain 
considerably lower T scores on N than 
the criterion abnormals and are in the 
case of the male abnormals actually be- 
low the mean T of the general popula- 
tion. There is a statistically significant 
difference between the mean T scores 
of normals and abnormals with non- 
triad elevations, but the overlap is great, 
particularly in the case of the males. It 
does not appear that the amount of dis- 
crimination is sufficient to have any hope 
that the N scale will be of practical util- 
ity for these groups in its present form. 
The statistical stability of the differentia- 
tion, however, suggests that further re- 
finement, especially in the removal of 
those items which contribute most to the 
high correlation with some of the other 
scales, might improve this state of affairs.
It is, of course, very difficult for
those items which must be operative in
producing such discrimination as does 
exist to show up as clearly powerful dif- 
ferentiators when their influence is sys- 
tematically masked by items showing a 
high correlation with the scales for ab- 
normal components which are being con- 
sidered. That even with the existence of 
such high correlations with Pt and Sc as 
will later be indicated, the scale is able 
to produce a statistically although not 
practically significant differentiation sug- 
gests that a further search for a highly 
generalized suppression variable would 
be rewarded. 

It might be thought that the exclu- 
sion of cases showing T above 70 on ? or 
F from the “test” normals is undesirable 
since such profiles do occur in practice. 
The present position is that for this pur- 
pose (evaluating differentiation) every ef- 
fort should be made to include only 
valid profiles, and that the clinical sig- 
nificance of apparently invalid ? and F 
scores may be subsequently determined 
once the power of the scale is clearly 
seen. If the normal records called invalid 
because of their unusual form combined 
with suspiciously high ? or F scales were 
included, the differentiation would not 
become worse but on the contrary would 
have been greatly improved. A group 
of 26 cases originally excluded from the 
“test” normals because of invalid look- 
ing profiles and high F or ? scores earns 
a mean T score of 37.31 (38.64 for males, 
35.75 for females) which is considerably
lower than the value for these means ac- 
tually obtained by the more rigorous 
selection. Among these 26 cases no pro- 
file shows a T above 50 on the N scale. 











CHAPTER V 


MISCELLANEOUS PROPERTIES OF N AND ITS RELATION
TO CERTAIN OTHER VARIABLES 


N CONSIDERED AS AN INVERTED “LIE” 
SCALE IN TERMS OF THE SCORES OF 
A GROUP OF CLINICAL ABNORMALS 
SHOWING NORMAL MMPI 
PROFILES 


In view of the slight though insignificant
difference between the N scores
of abnormals with freely varying profiles 
and those with profiles “held down” 
artificially by matching, together with 
the negative correlations later to be re- 
ported between the two chief “lie” scales 
and raw score on N, it seemed possible 
that giving few responses scored on N
might be allied to the same tendency 
which gives rise to elevated lie scores. 
Furthermore, there are actually 7 of the 
22 items on L6 (an unpublished lie scale
in process of development) which appear 
on the N scale also, scored in the oppo- 
site direction in every case. One partial 
test of such a hypothesis is to determine 
whether the T scores on N of clinical 
abnormals showing “normal” profiles 
tend to be elevated. 

The file of profiles for hospitalized fe- 
male abnormals was entered at random 
and the first 35 consecutive cases failing 
to show any MMPI personality compo- 
nent over 65 were extracted. (Actually, 
the first 19 cases so selected showed no T 
score over 60 on any MMPI scale; but it 
was necessary to raise the boundary to 65 
in order to get more cases.) All cases with 
? over 58 were eliminated from considera- 
tion, since elevated ? scores frequently 
mean spuriously low profiles achieved by 
simply failing to answer large numbers
of significant items. The clinical diagnoses
of these patients were variable, including
psychoneuroses (all types), depression,
psychopathy, schizophrenia,
mania. The mean raw score on N for 
these 35 patients with “normal” appear- 
ing profiles was 26.63, and the SD 9.565. 
The T score corresponding to this mean 
would be about 59. Another group of 16 
female abnormals (this time showing no 
scores above 60) taken from files for re- 
cent patients shows a mean raw score of 
25.5, corresponding to a T of 60. For 
the entire group of 51 female abnormals 
with normal profiles, the mean raw score 
is 26.27, corresponding to a T score of 
60, and the SD is 9.594. This mean 
raw score differs significantly from that 
of the 100 unselected normal females 
(t = 5.515, P < .01). 

A similarly chosen group of 26 male 
abnormals showing no score over 60 
showed results in the same direction but 
of lesser magnitude. The mean raw score 
for male abnormals with normal profiles 
was 24.92, corresponding to a T for males 
of only 54, and a SD of 7.4519. This raw 
score mean is of borderline significance 
statistically, the t in comparing it with
the mean of 29.14 for unselected normal 
males being 2.079 with 124 d.f., so that 
.02 <P < .05. Why there should be a 
smaller difference in the case of males 
is not clear, nor is it clear why the comb- 
ing of over 300 male abnormal profiles 
and 500 female should have resulted in 
a rather pronounced difference in the 
pattern of frequencies for clinical diag- 
noses among those managing to hold 
their profiles down to “normal.” Over 
half of the females in this study were 
diagnosed psychoneurosis, 3 were schizophrenics,
3 psychopaths, 2 each of simple
adult maladjustment, involutional de- 
pression, manic depressive depression, 
and 1 each of paranoid state, addiction, 
behavior problem, and manic. In the 
case of a sample of males taken in the 
same way (consecutive cases in the file 
showing “normal” profiles) there were 
only 5 psychoneuroses, 5 schizophrenias, 
7 psychopaths, 2 manic depressed, 2 para- 
noid state, 2 homosexual, 2 alcoholic, 
and 1 behavior problem. Quite possibly 
the difference in the relative frequency 
of cases which would be expected to show 
elevations on the neurotic triad is the 
major cause of the difference found be- 
tween the sexes in the functioning of N 
as a sort of “lie” scale. 
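
The two significance tests reported above for abnormals with "normal" profiles can be approximated from the summary figures alone. The sketch below is an assumption about the computation rather than a documented procedure: it uses a pooled-variance t and treats the reported SDs as computed with N in the denominator. The normal-group SDs (9.550 for males, 9.804 for females) are those reported for the unselected normals; the function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t for two independent groups.

    Assumes each reported SD was computed with N (not N - 1) in the
    denominator, so the sums of squares are recovered as n * sd**2.
    """
    df = n1 + n2 - 2
    pooled_var = (n1 * sd1 ** 2 + n2 * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Unselected normal males vs. 26 male abnormals with "normal" profiles.
t_male, df_male = pooled_t(29.14, 9.550, 100, 24.92, 7.4519, 26)

# Unselected normal females vs. 51 female abnormals with "normal" profiles.
t_female, df_female = pooled_t(35.59, 9.804, 100, 26.27, 9.594, 51)

if __name__ == "__main__":
    print(f"males:   t = {t_male:.3f} with {df_male} d.f.")
    print(f"females: t = {t_female:.3f} with {df_female} d.f.")
```

Under these assumptions the male value comes out close to the reported 2.079 with 124 d.f., and the female value close to, though not identical with, the reported 5.515. The small discrepancies presumably reflect rounding of the published means and SDs.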

The median T score of the 51 female 
abnormals with normal profiles was 60, 
and only 14 per cent showed T scores 
below 50. In the case of the males the 
median was 53.5 and 27 per cent showed 
a T score below 50.

In summary, it may be concluded that 
whatever the mechanism may be, the 
N scale functions to a significant extent 
inversely as a variety of “lie” scale not 
in the sense of indicating deliberate and 
conscious deception but some tendency, 
however produced, to avoid putting one- 
self in an unfavorable or abnormal light 
when answering the items on MMPI. 
Female abnormals who fail to show any 
of their personality components on the 
eight scales deviating by more than 1.5 
sigma (T = 65) tend on the average to
reach a T of 60 on the N scale. Male abnormals
with similarly “normal” profiles
otherwise show the same tendency to ele- 
vated T scores on N but to a consider- 
ably weaker degree, averaging a score 
only about 3/10 of a sigma above the 
mean of men in general (T = 54). Only 
about 1 female abnormal in 6 or 7 can
hold her profile down to 65 on all scales
and keep her N score below T = 50. 
About 1 male abnormal in 4 is able to 
achieve this, however. 

The occurrence of T scores in the re- 
gion 55 to 65 is, of course, so frequent 
among the general normal population 
that the present scale could not have any 
practical utility in the detection of lying 
when inspecting profiles for abnormality. 
It is only after being possessed of the in- 
formation that the patient is abnormal 
in spite of his apparently normal profile 
that it occurs to us to interpret an N of, 
say, 55 in such a way as to “explain” the 
low profile. The chief importance of the 
present finding regarding the function of 
N in a rough way comparable to L and
L6 is with reference to its ultimate theoretic
interpretation.


RESULTS OF APPLICATION OF N TO 
UNSELECTED ABNORMALS 


Upon the Rosanoff hypothesis which 
originally suggested the present inves- 
tigation, one might plausibly argue that 
if the N scale were measuring even in a 
crude way this so-called “normal” con- 
trolling component of personality, the 
mean score of a group of persons defined 
simply by their psychiatric abnormality 
should be elevated (i.e., raw scores on N 
should be low). The elevation of T 
scores on N might not be expected to be 
as great for unselected abnormal per- 
sons as for those in the criterion 
abnormal group even on Rosanoff’s 
hypothesis. For since, as was pointed
out previously, the criterion abnormals 
had their MMPI profiles “artificially” 
held down by the matching process, we 
might in effect have been selecting these 
abnormals who were clinically abnormal 
in spite of fairly small amounts of deviation
in the specific abnormal components
which N is supposed to control. Such a
group would be expected on Rosanoff’s 
hypothesis to be characterized by an ex- 
tremely small amount of the “normal” 
component, since such relatively slight 
deviations on the MMPI scales (assuming 
these scores for the moment to be valid) 
have resulted in their being psychiatrically
incapacitated.

Following this line of reasoning we selected
50 abnormals of each sex
from the file of hospitalized patients’ profiles.
Cases with organic involvement or
with diagnosis unrecorded or doubtful
were excluded. No restrictions were
placed on ?, L, and F in this sample, because
of another use to which it was intended
to put the data. The diagnoses
were, of course, very heterogeneous, as
was intended, the largest single category
being psychoneurosis.

The means and the variabilities of
these abnormals are given in Table 12,
together with a test of the significance
of their deviations from the unselected
normal sample upon whom the T scores
were based.

TABLE 12
Comparisons of Means and Variances of 50 Random Abnormals
of Both Sexes with the General Population Sample

                             Mean                       Diff.
Group                 N      Raw       T      SD        of        C.R.     P
                             Score                      Means

Male normals         100     29.14     50     9.550
                                                         1.00      .608    P = .23
Male abnormals        50     30.14     49     9.457
Female normals       100     35.59     50     9.804
                                                         4.97     2.834    P < .01
Female abnormals      50     30.62     55    10.284

We find that the mean raw score of
the female abnormals is about ½ an
SD below that of the female normals on
which the T scores are based, and that
this difference is statistically significant
at the 1 per cent level. The mean of the
male abnormals is actually higher than
the corresponding raw score mean for
the male normals, but this difference is
not significant statistically. The variances
do not differ significantly by the F-test.

Because of the sex difference these results
are difficult to interpret in any consistent
way. It would seem that the failure
to obtain any difference in the case
of the males constitutes stronger evidence 
against any interpretation in terms of 
Rosanoff’s theory of control than the 
slight although stable difference in the 
case of the females constitutes evidence 
in favor of such an interpretation. In 
view of the other findings on the average 
amount of deviation found in abnormals 
in MMPI research, a deviation of .5 sig- 
ma does not lend much support to the 
suggestion that these female abnormals 
are clinically abnormal because they do 
not have enough of the hypothetical 
“control” or normality variable to stabi- 
lize themselves. It must be remembered 
that this T score of 55 is still almost ½
SD nearer to the mean of unselected 
normals than the mean of female college 
students. The smallness of the deviation 
and the lack of any difference in the case 
of the male group lead us to doubt seri- 
ously (especially in conjunction with the 
evidence from intercorrelations and item
content to be discussed later) whether
the results can be construed as support- 
ing the original hypothesis. That there 
is any deviation at all in the case of the 
one sex can be “explained” on simpler 
grounds than the Rosanoff hypothesis, 
as will be seen shortly. 
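
The critical ratios of Table 12 can be recovered from the tabled means, SDs, and group sizes. The sketch below assumes the usual unpooled standard error of a difference between independent means, with each SD used as reported. This is an inference that happens to reproduce the tabled values closely, not a computation described in the text, and the function name is hypothetical.

```python
import math


def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two means divided by its (unpooled) standard error."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return abs(mean1 - mean2) / se_diff


# Values as given in Table 12 (normals first, abnormals second).
cr_male = critical_ratio(29.14, 9.550, 100, 30.14, 9.457, 50)
cr_female = critical_ratio(35.59, 9.804, 100, 30.62, 10.284, 50)

# The female difference expressed in SD units of the normal group,
# which is the "about one-half SD" deviation discussed in the text.
female_shift_in_sd = (35.59 - 30.62) / 9.804

if __name__ == "__main__":
    print(f"C.R. males:   {cr_male:.3f}")
    print(f"C.R. females: {cr_female:.3f}")
    print(f"female shift: {female_shift_in_sd:.2f} SD")
```

Both ratios agree with the tabled values of .608 and 2.834 to within rounding.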


THE RELATIONSHIP OF N TO AGE AND TO 
THE OTHER VARIABLES OF 
THE MMPI 


Further evidence against any interpre- 
tation of N as a positive “normality” 
component of personality comes from a 
study of its relation to the other MMPI 
scales and to chronological age. For the 
100 random male normals on whom the
T scores were standardized, there is evidence
of a progressive rise in mean N
score with chronological age in the general
population. For the age range 16-25
the mean N raw score was 23.73, as compared
with a mean of 34.30 for the age
group 36-45 in the same normal sample.
For all 100 male normals there is a correlation
of .38 between chronological age
and raw score, which is significantly different
from zero (P < .01). For the 100
female normals this correlation is only
.16, which is not statistically significant
in its deviation from zero (P > .10). The
means for females also do not show as
much change, being 33.67 for the youngest
age group versus 38.80 for the oldest.

The difference in the case of the males
is in part due to the fact that about half
of the normal males in the youngest age
group (16-25) were medical students and
such a college group, as pointed out previously,
tends to show low raw scores on
N.

The intercorrelation of the N scale
with the other scales of MMPI and with
a few unpublished or abandoned scales
was also investigated. In Table 13 below
are presented the correlation coefficients
for raw scores on N with certain other
measures on MMPI, for the sexes separately
and for normals and abnormals
separately. The variable CH is the no
longer employed “correction scale” for
the old hypochondriasis key (10), and L6
is a more subtle “lie” scale which is still
in the process of validation.

TABLE 13
Correlation of Raw Score on N with Other Variables

Group                  CH     L6      L      F     Hs     D     Hy    Pd    Pa    Pt    Sc    Ma

100 male normals        …    -.76   -.17     …      …     …      …     …     …     …     …     …
100 female normals     .78   -.65   -.11    .28    .54   .46     …     …     …     …     …     …
50 male abnormals      .82   -.82   -.28    .60    .55   .47     …     …     …     …     …     …
50 female abnormals    .78   -.84   -.21    .42    .37   .56     …     …     …     …     …     …


For the 100 normals, all correlations
above .20 are significantly greater than 
zero (P < .05); for the 50 abnormals, 
those above .27 are so significant (7). 
The correlations with F and L probably 
underestimate the intensity of associa- 
tion since the distribution of these two 
variables is so markedly skew that the 
conditions for the Pearson r being a 
maximally descriptive statistic of associa- 
tion are assuredly not fulfilled. The im- 
portance of an exact measure of covaria- 
tion was not such as to warrant the ap- 
plication of more appropriate descrip- 
tive statistics, however. 
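
The significance thresholds just cited (.20 for samples of 100, .27 for samples of 50) can be checked roughly with the Fisher z transformation. This is a swapped-in approximation, not the method or table used in the original, and it assumes a two-tailed test at the 5 per cent level with a normal deviate of 1.96; the function name is hypothetical.

```python
import math


def approx_critical_r(n, z_crit=1.96):
    """Approximate smallest |r| significant at the level implied by z_crit.

    Uses the Fisher z transformation, whose standard error is
    1 / sqrt(n - 3); tanh converts the critical z back to r.
    """
    return math.tanh(z_crit / math.sqrt(n - 3))


if __name__ == "__main__":
    for n in (100, 50):
        print(f"n = {n}: |r| must exceed about {approx_critical_r(n):.3f}")
```

The approximation lands near .20 for n = 100 and near .28 for n = 50, in the neighborhood of the cited thresholds. Exact values depend on the table consulted.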

Of particular interest in the table 
above are the findings which bear upon 
the psychological interpretation of the 
N scale. Although detailed discussion of 
this matter will be given in the following
chapter, some emphasis may be placed on
certain correlations here. Of particular 
interest are the following relationships: 


1. The high positive correlation of N with
the old correction scale CH.

2. The marked to high negative correlations
of N with L6, and the fact that
these correlations are higher in the case
of the abnormals. Also allied to this, the
negative correlations, although low,
with L.

3. The high positive correlation of N with
Pt and Sc.

4. The relatively low correlation (significantly
negative for normal males) with
Hy, despite the positive association with
Hs.

5. The positive correlation with F.

6. The general tendency to significant and
in some cases marked positive association
with the various abnormal components,
there being only one negative
correlation in the table except for the
lie scales.


The fact that N correlates low and in 
one group significantly negative with Hy 
and yet correlates fairly markedly with 
Hs suggested another correlational study. 
The Hy and Hs scales have many 
“somatic” complaint items in common. 
The Hy scale differs not only in that it 
does not share quite all of the Hs somatic 
items with the latter (which in itself 
would not account for this correlational 
difference), but more importantly the Hy 
scale contains a large number of non- 
somatic items, including those “denial 
of the psychiatric” responses that were 
mentioned in Chapter II. This group of 
items, as was pointed out there, have the 
common property of denying psychiatric 
involvement and asserting oneself to be extraordinarily
well adjusted socially.
These items are responded to by the 
hysteric in the statistically “normal” 
direction, and hence are recorded “O” in 
scoring the MMPI. Examples of these
items and the responses scored for Hy
are:

C-25 I have often lost out on things because
I couldn’t make up my mind soon
enough. (F)

D-52 I think most people would lie to get
ahead. (F)

E-43 When in a group of people I have 
trouble thinking of the right things 
to talk about. (F) 

F-54 I can be friendly with people who do 
things which I consider wrong. (T) 

F-25 I resent having anyone take me in so 
cleverly that I have to admit that it 
was one on me. (F) 

G-40 In walking I am very careful to step 
over sidewalk cracks. (F) 

H-10 I commonly wonder what hidden rea- 
son another person may have for 
doing something nice for me. (F) 


Because of the relations in the correla- 
tion table, plus the growing hypothesis 
(after abandoning Rosanoff’s) that the N
scale was at least in part detecting a ten- 
dency to say psychiatrically undesirable 
things about oneself even if unwarranted, 
it was felt that there might be a correlation
between N and these “I-am-psychiatrically-fine”
responses found in hysteria,
and that this correlation should
be negative. Accordingly, the correlation 
was computed for a sample of 60 normal 
females and 60 normal males, disregard- 
ing their profiles. However, there are 5 
items out of the 18 which are common 
to Hy and N (all five incidentally scored 
oppositely on the two scales). To eliminate
the effect of this overlap of items
another correlation coefficient was deter- 
mined, this time based on N versus the 
13 remaining items of Hy after the five 
overlapping were removed. The results 
of these calculations are in Table 14. 

TABLE 14
Correlation of Raw Score on N with the “O”
Items of the Hy Scale, for 60 Normals of
Each Sex, Both with and without
the 5 Items Common to Both

                        N Versus the       N Versus the
Group                   Entire 18          13 Non-Overlapping
                        “O” Items          Items

60 normal males           -.72               -.68
60 normal females         -.69               -.59

These four correlations all differ very
significantly from zero (P < .01) although
they do not differ significantly
from one another in passing from the
first to the second columns (P > .30). We
observe that there is a marked inverse 
association between raw scores on N and 
raw scores on the non-somatic psychic- 
denial items of the Hy scale, and that 
this relation is only to a small or insignifi- 
cant extent a function of the actual item 
overlap between the two scales being 
correlated. This fact may be said to sub- 
stantiate the prediction which prompted 
the investigation of the relationship, and 
to some degree to support the hypothesis 
regarding the nature of N. 


THE ITEM OVERLAP BETWEEN N AND 
OTHER MMPI SCALES 


As is usual in MMPI scales, there is 
considerable overlap of items derived on 
N and on other scales. Of the 78 items 
scored on N, 42 also appear on at least 
one other scale of the eight personality 
components, whereas the remaining 36 
do not. Some of these remaining 36, how- 
ever, appear on the lie scales or on the 
old CH scale. In the case of overlapping
items the mere number of common items 
is slightly misleading inasmuch as often 
the items are not scored in the same 
direction on both scales. Table 15 shows 
the number of items common to N and 
each of the other scales, and indicates 
the frequency of items in which the 
direction scored on N is the same or reversed.

TABLE 15
Item Overlap between N Scale and
Other MMPI Scales
(Number of N Items = 78)

            Number     Common     Number       Number
Scale       of         Items      Scored       Scored
            Items                 the Same     Opposite

L            15          1           0            1
F            64          6           2            4
Hs           33          3           2            1
D            60          9           4            5
Hy           60          9           3            6
Pd           50          6           5            1
Pa           40          2           1            1
Pt           48          7           7            0
Sc           78          4           2            2
Ma           46          4           0            4
Mf           60          6           3            3
CH           48         13          13            0
Ca           11          2           1            1
L6           22          7           0            7
G            62         16          16            0
+            56         11          11            0

For most of the personality scales
proper the overlap is quite small. The
largest amount of item community for
any of the personality components
proper is only 7 per cent. In many cases
it will be observed that the distribution
of responses scored same and opposite is
nearly equal. The most marked deviations
from this trend are on Pt, CH, L6, G,
and “+.” In these five cases, every item
on N is scored either consistently the
same or consistently the opposite as it is
on the other scale compared with N. In
the case of the first three scales in question,
the direction of scoring is in line
with the correlations found for the scales
as a whole. Pt and CH are positively associated
with raw scores on N, whereas
L6 is negatively associated. The significance
of the item overlap with the unpublished
scales G and “+” is not clear
because the psychological (and even clini- 
cal) nature of these scales is unknown, 
they having been derived by purely sta- 
tistical methods analogous to factor an- 
alysis without regard for any criterion 








A GENERAL NORMALITY OF CONTROL FACTOR IN PERSONALITY TESTING 47 


or clinical significance. So far as any evi- 
dence exists regarding G and “+” they 
represent some sort of a general factor 
of abnormal answering (whether due 
to test-attitudes or personality factors 
proper is not known) which permeates 
the MMPI items rather generally. 


AN EXAMINATION OF THE TEN MOST 
POTENT ITEMS ON N 


There is not any very obvious psycho- 
logical interpretation of the whole group 
of N items which can be easily made 


from an inspection of their actual verbal 


content. As has been pointed out previ- 
ously, they do have the common property 
(aside from the differentiating one which 
was used to identify them) of admitting 
to personal traits, attitudes, and experi- 
ences which are not admitted by the ma- 
jority of the general population and have 
the character in most cases of being “un- 
desirable” or “maladjusted” responses. It 
is of interest to examine the ten most 
powerful differentiators as judged by 
their proportional differences in the 
criterion groups. These ten most dis- 
criminating items are as follows: 


TEN MOST POWERFULLY DISCRIMINATING 
ITEMS OF N, WITH THE RESPONSE 
SCORED FOR THE MORE 
“NORMAL” INDICATED 


C-35 I liked school. (F) 

E-49 It does not bother me that I am not 
better looking. (F) 

E-52 People often disappoint me. (T) 


F-7_ What others think of me does not 
bother me. (F) 

Often, though everything is going fine 
for me, I feel as though I don’t care 
about anything. (T) 

I am not easily angered. (F) 

A windstorm terrifies me. (T) 


I am not afraid of fire. (F)

I have no fear of water. (F)

I have no dread of going into a room
by myself where other people have
gathered and are talking. (F)


E-52, F-47, G-28, H-35, H-36, H-41 do 
not appear on any of the personality 
scales proper of MMPI. Two of them 
appear on CH and one appears on L6
(scored reversed). All ten of these most 
potent items are responded to in the 
statistically “unusual” fashion more often 
by the criterion normals than by the ab- 
normals; and as is apparent from reading 
them, they are all responded to by the 
normals in the direction which would be 
considered superficially indicative of 
mental ill health and maladjustment. 
The responses to H-35, H-36, and H-41 
are certainly very mysterious as indi- 
cators of “normality,” as are the others 
also to a variable extent. The writer is 
not prepared to defend any hypothesis 
as to the common property of these items 
psychodynamically, nor even to assert 
that they have a common property. There 
may be several “suppression” variables 
operating via these different items such 
that it would be a mistake to attempt an 
interpretation to hold for all of them. 
This question will be discussed further 
in Chapter VI. 


ITEM OVERLAP OF N WITH THE “NOR- 
MAL” COMPONENT OF THE HUMM- 
WADSWORTH TEMPERAMENT SCALE 


Since the “normal” component of the 
Humm-Wadsworth Scale was designed 
by the authors to measure the hypotheti- 
cal normality or control factor of Rosan- 
off which gave rise to the present investi- 
gation, it is of interest to inquire what 
relation, if any, holds between the 
Humm-Wadsworth normal scale and the 
present. There are no correlational data 
available at the present writing, but some 
idea may perhaps be gained by a study 
of item overlap between N and the nor-
mal component of the Humm-Wads-
worth. Of the 78 items on N and the 38
on the normal component of the Humm-
Wadsworth, we find that there are seven
items in common with almost identical
wording. The items in question are indi-
cated with their item numbers in the two
tests:


N-scale Normal scale of 
from MMPI Humm-Wadsworth 
C-25 73 
D-13 109 
D-52 O4 
F-13 282 
G-27 28 
I-15 97 
I-30 279 


Although the number of items is too 
small to draw any definite conclusions, 
it is worth mentioning that in every one 
of these seven items, the response scored 
for “normal” on the present N-scale is 
opposite to the direction scored for “N”
on the Humm-Wadsworth. It seems un- 
likely from this finding alone that the 
two are closely related, and it may be 
that their correlation is negative as would 
be suggested by this finding. 


A MINOR EXPERIMENT IN “BLIND” 
DIAGNOSIS UTILIZING N 


The distributions of criterion and test 
cases and the tests of significance indicate 
whatever amount of validity the N scale 
possesses for the discrimination of devi- 
ant-scoring normals from deviant-scoring 
clinical abnormals. However, a minor 
experiment was done involving an at- 
tempt to discriminate a group of profiles 
obtained from supposedly “normal” per-
sons from a group obtained from hos-
pitalized abnormals, using the N-scale as
a basis. Three separations were of in-
terest:


1. The separation obtained when the
judge merely looked at the profiles and
was asked to pick out the actual abnor- 
mals, in the absence of any N score on 
the profile. 

2. The separation obtained when the 
judge was in situation (1) except that 
the N-score had meanwhile been added. 

3. The separation obtained when the pro- 
files were mechanically sorted in terms 
of the magnitude of the N-score only, 
with no “judgment” based upon profile 
form and amount of elevation being 
permitted. 
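
The third, purely mechanical separation can be sketched in a few lines. This is a minimal illustration only, assuming (as the text later specifies) that the profiles are ranked by T-score on N and that the top half is labeled "abnormal" and the bottom half "normal." The function name and the example scores are hypothetical and are not data from the study.

```python
def mechanical_sort(n_t_scores):
    """Label the half of the profiles with the highest T-scores on N
    "abnormal" and the other half "normal", using no other information."""
    # Rank profile indices from lowest to highest T-score on N.
    order = sorted(range(len(n_t_scores)), key=lambda i: n_t_scores[i])
    half = len(order) // 2
    labels = [None] * len(order)
    for rank, idx in enumerate(order):
        # Bottom half of the ranking is called normal, top half abnormal.
        labels[idx] = "normal" if rank < half else "abnormal"
    return labels


# Hypothetical T-scores on N for six profiles (illustrative only).
example_scores = [38, 52, 44, 61, 47, 41]
example_labels = mechanical_sort(example_scores)
```

No judgment of profile form or elevation enters the procedure; the N-score alone determines the label.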


In order to reduce the difficulty and 
the fatigue of making the separation, the 
number of cases employed was not great; 
and this fact increases the difficulty of at- 
taining statistical significance, by some 
tests at least. There were chosen at ran- 
dom (every third case, omitting college 
cases) from the entire group of deviant 
scoring normals (persons with any T ≥
70 but out of the hospital) 22 cases for
“test normals.” These were all males be- 
cause of the lack of sufficient test cases 
among the females, which would have 
resulted in too large a share of the nor- 
mals being members of the criterion 
group, thus unduly biasing results in 
favor of the scale. The profiles for these 
22 male “normals” showing deviant 
scores on at least one MMPI scale were 
then drawn by a third party. 

This third party also recorded profiles 
for 22 hospitalized abnormals taken at 
random from the hospital files, excluding 
cases with almost certain invalidity 
judged by elevated ? and F. Since the
writer, who was to be one of the judges, 
knew a priori that all of the “normals” 
would have at least one score T ≥ 70
(having been selected for that reason), 
it was necessary to include among the 
abnormals only those having abnormal 
profiles. The result is naturally to make 
the problem more difficult for the N-
scale, since from the previous findings
we know that abnormals with non-devi- 
ant profiles tend to score high T’s on the 
N-scale. All of the present 44 profiles, in 
other words, were “deviant” in that at 
least one of the eight personality scales 
showed T ≥ 70.

These 44 profiles were randomized by 
the third party to the experiment and 
presented simply as profiles with T-scores 
indicated at the bottom, but without any 
score on N having yet been recorded on 
them. Two judges were tried in the sepa- 
ration, and the separation of the first 
judge was, of course, not known to the 
second. One of the judges was Professor 
S. R. Hathaway, co-author of the MMPI; 
the other was the present writer. The 
information accessible to the judge, be-
sides the profile, was the statement that
of the 44 cases, 22 were in fact the records
of persons clinically abnormal and hos-
pitalized; whereas the remaining 22 were
profiles of persons who, so far as known,
were psychiatrically “normal” and were
not under a doctor’s care at the time of
taking the test but were out in the com-
munity. The judge was instructed that
he must divide the 44 into 22 he guessed
“normal” and another 22 he guessed “ab-
normal.” This restriction as to how many
could be put in each category was im- 
posed on Professor Hathaway because 
the writer, of course, knew the group to 
be divided evenly while he was making 
his judgments. 

There was a difficulty arising from the 
entire design which showed up in mak- 
ing the statistical analysis, namely, that 
the judges were sufficiently “good” at dis- 
criminating the normals from the ab- 
normals regardless of N that in order for 
a significant difference in proportion cor- 
rect to be algebraically possible with 44 
cases the discrimination using N would
have to be practically perfect.

After the classification had been made 
by Professor Hathaway using the profile 
without N, the writer made his classifi- 
cation without N. Then the N scale was 
added by the third party, and the writer 
attempted another separation. The 
second separation by the writer was done 
several days after the first, but it is 
possible that the earlier sorting (although 
its correctness was not checked) exerted 
its influence on the latter. Nevertheless, 
the writer was found in subsequent an- 
alysis to have made a total of 18 changes 
(nine exchanges) in the second sorting. 

For purposes of statistical analysis of 
the results of these three separations (two 
by the same judge but differing in that 
the N scale was on the profile in making 
the second separation) the null hypo- 
thesis was assumed to hold for a four- 
fold table. The goodness of separation 
was tested by the deviation of values in 
the four-fold table from this null hypo- 
thesis, using chi-square as the test of sig- 
nificance. On the hypothesis that the 
judges could not actually discriminate, 
the four entries in the table should be 
equal within errors of sampling. A 
fourth separation was carried through in 
which the entire set of 44 profiles was 
simply arranged in order of magnitude 
of the T-score on N, a median drawn 
between the 22nd and 23rd case so or-
dered, and the top half of the cases me-
chanically labeled as “abnormal” with
the bottom half labeled “normal.” The 
division occurred at between T = 45 and 
T = 46, about one-half sigma below the 
general population mean. This latter 
fact probably reflects in part the arbi- 
trary exclusion of abnormals with non- 
deviant profiles, who occur in some con- 
siderable number in practice and would 
almost certainly have raised the median
score on N had they been included. Suf-
fice it to say that this fourth separation
was used to compare the results of a
purely statistical and mechanical use of
N with the judgment of persons more or
less “expert” with regard to the clinical
interpretation of MMPI profiles.

The results of these four separations
are given in Table 16 below, in terms of
the value of chi-square from the four-
fold table as well as simply the per cent
of “correct” judgments out of 44.


TABLE 16
Results in the Four Separations in the “Blind” Diagnosis Experiment

                                           Per Cent   Chi-                 Significance
Separation                                 Correct    square    P          of Deviation
1. Hathaway, without N-score                  59       1.455    P > .16    Not significant
2. Meehl, without N-score                     68       5.818    P = .025   Borderline
3. N-scale, working mechanically              73       9.091    P < .003   Significant
4. Meehl, judging with N-score available      77      13.091    P < .001   Significant


We see that the first judge does not
succeed in discriminating the normal
from the abnormal profiles significantly
better than chance would allow. The
second judge does somewhat better,
achieving a deviation from chance that
has borderline statistical significance be-
tween the 5 per cent and the 1 per cent
level. The separation of the second judge
is improved somewhat (but not statisti-
cally significantly, at least with this
number of cases involved) by adding N
to the profile. The most important find-
ing here is probably that the N scale
used in a mechanical fashion does al-
most as well as the better judge when he
makes use of it, and does better than
either judge working in its absence. It
should be pointed out that the second
judge may have been aided somewhat
unconsciously, both in his matching with
and without the N-scale, by having seen
a few of the profiles several months pre-
viously in another context, although it is
doubtful whether this could have exerted
any marked effect. In any case, the results
using N alone are of the greatest signifi-
cance in the above table. Even including
cases with non-triad elevations and arti-
ficially holding all abnormal profiles up,
it appears that N has sufficient differenti- 
ating power to be of practical value. 
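
The chi-square values in Table 16 can be checked with a short calculation. The sketch below assumes the uncorrected Pearson chi-square for a four-fold table with one degree of freedom, and it reconstructs the number of correct judgments (26, 30, 32, and 34 out of 44) from the reported percentages. Neither the counts nor the exact formula is stated in the text, so both are assumptions, and the function names are illustrative. Because every separation placed 22 profiles in each category, the two kinds of error are necessarily equal in number, so the count of correct judgments fixes the whole table.

```python
import math


def fourfold_chi_square(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_for_even_split(n_correct, per_group=22):
    """Chi-square when both the true groups and the guessed groups have
    per_group cases each, so hits and misses are symmetric."""
    if n_correct % 2:
        raise ValueError("an even split forces an even number of correct calls")
    hits = n_correct // 2          # correct calls within each true group
    misses = per_group - hits      # wrong calls within each true group
    return fourfold_chi_square(hits, misses, misses, hits)


def p_value_one_df(chi_square):
    """Upper-tail probability of chi-square with one degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Correct counts out of 44, reconstructed from the percentages (assumption).
reconstructed_correct = {
    "Hathaway, without N-score": 26,
    "Meehl, without N-score": 30,
    "N-scale, working mechanically": 32,
    "Meehl, with N-score available": 34,
}

results = {
    name: (
        round(100 * k / 44),                       # per cent correct
        round(chi_square_for_even_split(k), 3),    # chi-square
        p_value_one_df(chi_square_for_even_split(k)),
    )
    for name, k in reconstructed_correct.items()
}
```

Under these assumptions the computed chi-squares are 1.455, 5.818, 9.091, and 13.091, agreeing with the table to three decimals. Under the null hypothesis each cell of the four-fold table would contain 11 cases.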





CHAPTER VI 


PSYCHOLOGICAL INTERPRETATIONS OF N 


THE PREVIOUS treatment of the N-
scale in this paper has been confined
almost entirely to a consideration of its 
statistical properties. Dynamic functions 
have been loosely attributed to the score 
in remarks such as “An abnormal female 
with all personality components under 
T = 65 cannot easily hold her T-score 
on N under 50” and similar comments; 
but no serious effort has been made to 
give a theoretical and psychodynamic ex- 
position of what N is behaviorally and 
how it produces the discriminations 
which it does. 

The present chapter is only a tentative 
effort in that direction. It is characteristic 
of research on the MMPI that very little 
attention is paid to the actual content of 
items, nor is it usually asked “How” 
patients of a given description come to 
respond in the way they empirically are 
found to do. In some cases, as in many 
of the items of the D-scale and all of the 
items on the Hs scale, the relation of 
the scored response to the clinical entity 
discriminated is “obvious,” i.e., one can
see (or so he thinks) why a depressed or
hypochondriacal person would come to
respond in the way indicated by the scor- 
ing key. In many if not most cases, how- 
ever, this relation is either tenuous or 
altogether lacking. In the case of N we 
are in the unfortunate position of having 
to reason backwards from the properties 
of the scale and, to a lesser extent, from 
the nature of the items, to a guess as to 
its psychological “nature.” From the 
findings in regard to other scales, it must 
be obvious that interpretations must of 
necessity be held with only very slight 
confidence in so far as they are based
upon study of the items. It needs also to
be pointed out that all of the 78 items 
probably do not “function” in the same 
way to perform the empirical discrimina- 
tions found, so that further statistical 
and experimental study of N needs to be 
done in order to isolate, at least for 
theoretical reasons, the components of 
which it is very likely a composite. 

What we know of N experimentally is 
that it seems to enable us to distinguish 
persons who show deviant MMPI pro- 
files but remain clinically “normal” from 
persons with equally deviant profiles who 
do not. This tendency is well established 
in the case of the scores on the neurotic 
triad, but it also holds up statistically 
although not usefully for the other non- 
triad components. There is only a slight 
tendency (none for one sex) for it to dis- 
criminate between normals generally and 
abnormals, when the rest of the MMPI 
profile does not enter into consideration. 

It seems appropriate at this time to 
summarize the reasons which have ac- 
cumulated throughout the preceding 
pages for not considering N, at least the 
major part of it, as measuring the “nor- 
mality” factor hypothesized by Rosanoff 
and measured (purportedly) by the N- 
component of Humm and Wadsworth. 
I submit that the following findings ren- 
der an interpretation of N in these terms 
extremely unplausible: 

1. The responses scored for normality 
on the present N-scale are in the vast 
majority of instances the statistically ab- 
normal responses from the standpoint of 
what most persons say to these items. 

2. The responses scored for normality 
are with practically no exceptions re-
sponses which admit to personal traits
and attitudes of the sort that are psy- 
chiatrically undesirable and indicative 
prima facie of maladjustment and un- 
happiness, assuming them to be an ade- 
quate description of the person. 

3. The raw score on N, which tends
to be high among “normals” who have 
deviant profiles but low among abnor- 
mals, shows a marked and statistically 
significant deviation in the low direction 
in unselected college students and in 
adults from the normal population who 
have previously attended college. To 
reconcile this finding with the interpre- 
tation in question it would be necessary 
to assert that college students and col- 
lege-educated persons have for some un- 
known reason considerably less normal, 
inhibiting, and controlling components 
in their personalities than persons in 
general, and in fact that such college per- 
sons have even less of this normalizing 
factor than an unselected sample of psy- 
chotic and neurotic persons. 

4. The N-scale differentiates only to a 
slight extent between female abnormals 
and females from the general normal 
population, and it does not differentiate 
male abnormals at all. Regardless of the 
magnitude of the other specific compon- 
ents, on the Rosanoff hypothesis it would 
be expected that at least on the average, 
persons who have broken down psy- 
chiatrically would have less of the “nor- 
mal” component than an unselected
group of normals. 

5. The raw scores on N correlate to a 
significant extent with the abnormal 
components of MMPI, both among nor- 
mals and among the abnormals. In every 
case but one these correlations are posi- 
tive, and in some cases (e.g., Pt) they are
very high. There is no rational way to 
interpret such a finding in terms of a
positive “normal” component of the kind
in question. 

6. The raw scores on N correlate high 
negative with a subtle lie scale, and to a 
lesser but still negative extent with the 
old lie scale. While this fact does not 
contradict the Rosanoff interpretation 
of N, still it remains quite unexplained 
on that basis. 

7. In the case of the non-triad com- 
ponents of abnormality, it has been 
pointed out that if a person shows devi- 
ations on these components and has a 
low raw score on N, he is almost certain 
to be psychiatrically incapacitated to the 
extent of being in a hospital. Whereas if 
he has high raw scores on N, he is by no 
means guaranteed against such a condi-
tion. The hypothesis in question posits
a relation working both ways—too little 
“control” will result in break-down in 
the presence of slightly elevated abnor- 
mal tendencies; but conversely, an excess 
of control is supposed to prevent the ap- 
pearance of abnormality even when such 
tendencies are present. 

8. There are seven items on the 
present scale which occur also, with 
slight modifications of wording in some 
cases, on the Humm-Wadsworth “nor-
mal” scale. All seven of these items are
scored in the opposite direction for “nor- 
mal” on the present scale from the way 
they are scored on the Humm-Wads- 
worth. To the extent that the latter scale 
discriminates between normals and ab- 
normals generally and functions in the 
way these authors describe it to, this fact 
argues against making an interpretation 
in terms of Rosanoff’s theory for the 
present case. 

For these reasons the interpretation 
of N in terms of a positive, dynamic per- 
sonality component of “normality” or 
“control” which inhibits or regulates
those components of temperament which
might otherwise result in psychiatric up- 
set must be discarded. This is not to be 
construed as meaning that Rosanoff’s 
hypothesis is false, nor that the Humm- 
Wadsworth normal component does not 
measure Rosanoff’s variable, but merely 
that the present N-scale cannot be inter- 
preted psychologically in that way. It 
would seem that if such a factor as 
Rosanoff’s does exist, somewhere among 
the 495 items studied one should find 
items loaded with it. And this latter may 
actually be the case, since the reasons ad- 
duced against such an interpretation of 
N apply to the scale “as a whole,” i.e., as
manifested in the correlational and dis- 
criminating properties of the entire set 
of 78 items, which are probably not, as 
has been pointed out previously, psycho- 
logically homogeneous. It is conceivable 
that the minority of items which are 
“O” responses on the N scale are more 
akin to the kind of item that would 
sample Rosanoff’s hypothetical normal- 
ity factor. On the other hand, it may be 
that these items would turn out to be of 
the same nature as some of those on the 
Humm-Wadsworth—merely denials of 
symptoms of such severity or diagnostic 
portent as would hardly occur in normal 
persons. Thus, the first two items of N 
are what one would a priori expect to 
find on a “normal” scale: 


“I am troubled by attacks of nausea and 
vomiting.” (F) 

“I have had attacks in which I could not 
control my movements or speech but knew 
what was going on around me.” (F) 


If all of the items on N were of this sort, 
the scale would be trivial and thoroughly 
uninteresting theoretically, but it would 
not be difficult of interpretation. If such 
items preponderated, one would simply 
have a case of listing those self-reported
symptoms which would practically never
occur in normal persons but would fre- 
quently occur in psychoneurotics of the 
sort who score high on the neurotic triad. 
To be “normal” would raise your score 
in answering such items in the indicated 
direction simply because to be “normal” 
means being relatively free of such symp- 
toms. From the evidence that has been 
adduced up to now, it is clear that the 
major part of N cannot be interpreted 
in that light either. 

Another possible interpretation of N 
is that it is an “insight” scale in some 
sense. The persons who are in the hos- 
pital with Hs scores of 75 differ from 
those who are out of the hospital with Hs 
scores of 75 in that the latter have more 
“insight” into their own behavior and 
motivation and hence avoid the actual 
development of symptoms, even though 
they do in fact possess an equal amount 
of “hypochondriasis-component.” This
insight variable also shows up in a will- 
ingness to admit to oneself the sort of 
thing which appears on the N-scale. 

This hypothesis is not nearly so easy 
to refute as the hypothesis which posits 
Rosanoff’s positive control factor. In
fact, a number of the findings can be 
brought to bear in support of it. The 
negative correlation with L and L, and 
with the 18 “O” items of the Hy scale 
would fall in line with such an interpre- 
tation; for it has always been empha- 
sized by the MMPI authors that the “lie” 
scale does not really mean lie but a ten- 
dency to put oneself in a good light, 
whether conscious or unconscious. The 
tendency of N to show elevated raw 
scores in cases with high Pt to a much 
greater extent than in cases with elevated 
neurotic triad might also be brought as 
evidence, since the psychasthenic is no-
toriously more “insightful” at least
verbally than the hypochondriac or the
hysteric. 

Opposed to such an hypothesis, how- 
ever, are numbers of bits of data. First 
of all, on this assumption there is no 
reasonable account of the scores obtained 
from college persons. We must assume 
that the average college student or col- 
lege-educated person is about one SD 
below the mean of the general popula- 
tion in terms of how much “insight” he 
has into himself, which is conceivable to 
be sure but it is not supported by any 
other evidence. In the second place, we 
must assume that if a patient in a psycho- 
pathic unit with a psychosis or psycho- 
neurosis is not deliberately selected for 
having a “normal” profile (as were the 
criterion abnormals to some degree), he 
has as much of this insight as the average 
non-hospitalized person—fully as much 
in the case of male abnormals and only 
slightly less in the case of female abnor- 
mals. In other words, the insight hypo- 
thesis fails to account for the finding 
that abnormals show up as having less 
insight only when their other scores are 
restrained from appearing abnormal, but 
not when these other scores are allowed 
to vary freely. What makes the difference 
here is the scores, not the presence of 
abnormal symptoms themselves, it would 
appear. If I am enough of a hypochon- 
driac so that my relatives bring me to an 
institution for psychiatric treatment, I am 
likely to have as much “insight” as any- 
one else in the general population of 
normal persons, if this interpretation of 
N is correct. It is only when (in spite of 
being a hypochondriac) I fail to show
very elevated scores on the neurotic triad 
that my “insight” may be expected to be 
poorer than average—and according to 
the findings, not very much poorer at 
that. In this line of thought, the insight
postulated comes to be primarily in
terms of the responses to test items, and 
not in terms of whatever kind of insight 
might protect a person with abnormal 
tendencies from developing the clinical 
symptoms of a neurosis. 

The correlation of N with Sc and, in 
the abnormals, with Pa does not tend to 
support such an interpretation either, 
nor do any of the positive correlations 
with abnormal components, nor those 
with G and “+.” Furthermore, it should 
be remembered that some of the scales 
on MMPI are “symptomatic” themselves 
and cannot readily be regarded as 
measuring basic components which ap- 
pear or are inhibited from appearing, as 
the case may be. Thus, in the neurotic 
triad scales on which the N-scale was 
originally derived, to get a high score 
often simply means to complain of those 
symptoms which, if actually present, 
would entitle one to be called whatever 
diagnosis is written above the score on 
the profile. To obtain a high score on 
Hs, for example, it is necessary and 
sufficient to say that one often feels weak 
all over, that one’s hand shakes when he 
tries to do something, that one’s head 
seems to hurt all over, that one is 
bothered by stomach trouble often, that 
one is not in so good health as his friends, 
and the like. These statements are the 
complaints which, if he believes them and 
talks about them and acts as if they were 
real, define the hypochondriac clinically. 
How is it possible to say that the “nor- 
mals” who have sufficient insight as 
measured by N avoid being hypochon- 
driacs in spite of answering the items in 
this way, when answering the items in 
this way is per se hypochondriacal? Simi- 
larly, in the case of depression, to say “I 
cry easily,” “I brood a great deal,” “I 
don’t seem to care what happens to me,”
“I am not happy most of the time,” and
so on, would seem to be prima facie evi- 
dence of actual depression, if the person 
is using these words in the way in which 
most people use them to describe their 
psychological condition. 

Lastly, consider the N items them- 
selves. Persons who receive deviant scores 
on the neurotic triad but say that they 
are easily angered, that a windstorm 
terrifies them, that they are afraid of fire, 
that they have a fear of water (to men- 
tion some among the most potent), have 
a much better chance of being out of the 
hospital than those who do not assert 
these unpleasant things about them-
selves. It seems stretching the point con-
siderably to say that for a person to
admit that a windstorm terrifies him 
shows he has more insight than another 
person who does not admit this, and 
that, if they both have an equal amount 
of hypochondriacal, depressive, or hy- 
steroid components, the former is more 
likely to remain out of the hands of a 
psychiatrist. It seems much more parsi- 
monious to the present writer to doubt 
that all the “normals” who said “A wind-
storm terrifies me” were merely mani-
festing more honesty and insight than
the abnormals who were equally terri- 
fied; and I suspect that neither group was 
especially terrified but that those who 
said that they were terrified were not 
actually so at all. This interpretation 
brings us to the last hypothesis, which 
will be defended as being somewhat more 
plausible than the others. 

In terms of the list of possible ways in 
which a “test miss” can occur (Chapter 
II), it seems to the writer that the most 
reasonable place to assign the source of 
N’s variance and discriminating power 
is under Group I, the “Errors of Meas-
urement” category. This includes the
cases in which the personality variable
underlying the scale response does not 
actually exist to the extent indicated by 
the score. In terms of this category, the 
“normals” with depression scores of 85 
are not out of the hands of the psychia- 
trist because they have a lot of “control” 
which holds this depression component 
in check; nor are they outside because 
they have failed to show certain very 
gross signs and symptoms as part of their 
depression; nor are they outside because 
they have enough “insight” to keep their 
depression from incapacitating them. 
They are most probably outside because 
they are simply not nearly so depressed 
(in the biophysical sense) as the people 
inside, and their depression score of 85 
is spuriously high because of some other 
personality or verbal factor. The present
hypothesis is that it is this factor of 
“plus-getting” that is detected by the N- 
scale. 

In terms of the list of possibilities 
given under “Errors of Measurement,” 
categories A-1 and A-2 have been system- 
atically excluded in the present study by 
the insistence upon very rigorous stand- 
ards of validity in terms of profile form, ? 
and F scales. The only other possibilities 
are “Unusual interpretation of the mean-
ing of the questions of a sort other than
those interpretations related to the per- 
sonality trait in question,” and “The 
patient does not see the facts as they are.” 
It has been pointed out previously that 
unusual interpretation as such, and not 
seeing the facts as they are as such, do 
not necessarily give rise to test misses, 
because of the deliberately projective ele- 
ment in the whole theory of the MMPI. 
But there may be some personality traits, 
as well as some quite superficial verbal 
patterns, which are almost unrelated to 
the variable with which a given item is
loaded and for the discrimination of
which it was empirically selected. The 
traits and patterns are the “irrelevant” 
components of the predictor as described 
by Horst (16), and the hypothesis I wish 
to consider seriously is that the N scale 
acts as a suppression variable for some 
of these irrelevant components. 

The detailed character of these ten- 
dencies I am, of course, unprepared to 
elucidate. The crudest description would 
be to say merely that they involve the 
tendency to get high scores on personality 
tests of the MMPI variety, which is say- 
ing little more than the empirical find- 
ings except that it points the way to an 
interpretation in terms of test-taking as 
such instead of one in terms of psychiatric 
personality components and dynamisms 
as did the other hypotheses. There have
been observed by Professor Hathaway 
and the writer a number of persons who 
were excessively preoccupied with their 
own psychiatric status, who were rumi- 
native and anxious about their own state 
of mind and their own feelings, who dis- 
played the “psychasthenoid tempera- 
ment” in a modified and watered form 
without being really psychasthenic in the 
sense of interfering compulsions and ob-
sessions. Many of these persons are per-
petual “plus-getters” on personality tests,
and in fact an interest in them was, as 
has been pointed out before, the starting 
point for the present study. When the 
present investigation began, the Rosan- 
off-Humm-Wadsworth hypothesis had 
structured the field so that the failure of 
these high-scoring “normals” to be in a 
mental hospital or at least incapacitated 
for their normal functions was envisaged 
in terms of Category II instead of Cate- 
gory I. On the present evidence it seems 
more fruitful to conceive of these persons
as simply mismeasured cases, and the
practical problem as one of improving
the instrument so as to suppress the 
verbal-personality components that lead 
to the mismeasurement. 

It is not denied in the present hypo- 
thesis that “personality” factors are in- 
volved in these components, whatever 
they may turn out to be. That they are 
allied to whatever the psychasthenia scale 
measures is clear from the correlation 
with Pt, as is also the case for Sc. But the 
difference between this and Category 
II-B-1 (“other traits inhibit the trait in 
question”) is that the other traits in- 
volved do not inhibit the measured trait 
but rather prevent it from being accur- 
ately measured. Thus, a high N score 
does not indicate that the personality 
component measured by N or Pt is acting 
to “restrain” a man’s depression com- 
ponent—an interpretation that is far 
fetched to begin with in the light of what 
psychasthenia is like subjectively—but 
rather that a man’s depression compo- 
nent as well as his psychasthenia com- 
ponent are not so strong as the scores on 
D and Pt would indicate, because he has 
an excessive amount of whatever it is 
that is measured by N and produces high 
scores on psychiatric scales. 

The extent to which N involves factors 
closely allied to psychasthenia proper as 
contrasted with more superficial and 
semantic trends leading to high scores 
on Pt and Sc alike will not even be esti- 
mated here because there is no evidence 
on the point. The correlation with Sc 
remains mysterious except for the minor 
point that there is an element of semantic 
confusion and unrealistic thinking, which 
is common to these pseudo-psychasthenic 
persons and to schizoid persons alike, 
but which in the latter cases assumes a 
much more malignant form. “I wish I 
could be as happy as other people seem
to be” may represent the distress of an
acutely miserable person with a genuine 
depression. But it may also represent the 
superficial complaint of a ruminative, unrealistic,
semantically dissociated person
who, to be sure, may be feeling blue but
has no real conception of what it means 
to be “depressed” as is a manic-depres- 
sive. It is something akin to this kind of 
characteristic that I have in mind when 
I use the phrase “plus-getting” as de- 
scriptive of N. 

The high positive correlation of N 
with Pt and Sc, the high negative cor- 
relation of N with L,, the high positive 
correlation with C,, which is simply a 
“correction scale” (suppression variable) 
for hypochondriasis introduced because 
the old H scale was often elevated in
psychiatric patients free of hypochon- 
driasis, the high negative correlation of 
N with the “psychiatric denial” items of 
the Hy scale, all of these findings are 
readily understood in the light of the 
hypothesis that N measures “plus-getting”
tendencies. That such a trait
should show up correlated positively 
with almost all of the MMPI scales for 
abnormal components is to be expected. 
That it should differentiate normals with 
deviant profiles from abnormals with 
deviant profiles is understandable since 
it means that the normals are not deviant 
as their profiles suggest. On the other 
hand, that normals in general should 
not differ much from abnormals in gen- 
eral (unless the latter achieve normal 
profiles) is also understandable, since the 
N scale is not measuring a personality 
trait of intrinsic psychiatric significance 
but one which shows up as significant 
only because a test has been taken. The 
fact that a low raw score on N with a 
deviant profile indicates clinical abnormality
means merely that if a person
does not have much of this plus-getting
tendency, the ‘plus’ responses he gave 
are likely to indicate a genuine psychi- 
atric deviation. On the other hand, a 
high raw score in an abnormal person is 
not difficult to understand, since there 
is in this interpretation no reason to deny 
a lot of the plus-getting tendency to an 
abnormal person any more than to a 
normal one. 

It must be admitted that the present 
hypothesis is hardly detailed enough to 
be theoretically satisfying, but it would 
be illegitimate to do more than hint at 
its nature on such flimsy evidence as now 
exists. To design experiments having this
hypothesis in mind is difficult, for it is 
not possible to use as criteria of “genu- 
ine” psychiatric deviation any of the in- 
dicators—responses to test items, self- 
descriptions in interviews—which are 
presumably greatly affected by the factor 
hypothesized. One possible approach 
would be an objective (say, multiple- 
choice) test based upon those items on 
N which are answered in the scored 
direction, in an effort to find out whether 
the differences in interpretation could 
be elicited through verbal means alone. 
Thus, to say “I am afraid of fire” may be
a sign of a phobic reaction in a clinically 
compulsive-obsessive patient. But this 
response also scores a point on N, in 
which case it may merely reflect a differ- 
ence in interpretation on an almost 
purely semantic basis. There is, for ex- 
ample, an “obvious” sense in which 
everyone is afraid of fire, in that everyone 
will have anxiety if trapped by fire, 
everyone will avoid putting his hand in 
the flames, and so on. Yet the item fre- 
quencies for the normal population show 
that to interpret the item in this “ob- 
vious” way is a rare thing. The majority 
of “normal” persons do not say they are
afraid of fire, presumably because they
interpret the question in a different way. 
A criterion “normal” who says “I often 
dream about things that are best kept 
to myself” gets scored a point for de- 
pression. If he says many things of this 
sort, he gets a high T score on depression. 
The unusual semantics, self-deprecation, 
verbal pessimism, or “psychiatric hypo- 
chondriasis” which leads him to respond 
in this way may also, however, lead him 
to say “I am afraid of fire” and by so
doing raise his score on N. How often 
does “often” mean when he says he 
often dreams about things best kept to 
himself? To remove the ambiguity of
that “often” would lower the discrimina- 
tion of many MMPI items, by reducing 
them to mere self-rating devices. But 
when we leave the ambiguity in, we 
leave it in at a price, due to the existence 
of other variables besides those we are 
testing by that item. If N is loaded also 
by some of these other variables, and not 
heavily loaded with depression itself, 
then we can try to suppress the non- 
depression components of the D-scale by 
the use of N. An intensive study of devi- 
ant-scoring “normals” with high and low 
raw scores on N using a multiple-choice 
method might shed some light on this 
problem. When a subject says “I am 
afraid of fire,” he could indicate later 
whether he interpreted the item to mean 
“Thinking of fire gives me anxiety,” “I 
become rather nervous when near a large 
fire,” “I would probably be frightened
if caught in a burning house,” “It is a 
fact that fire can be dangerous to a per- 
son,” and so forth. 
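
The idea of using N to suppress the non-depression components of the D-scale can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not the procedure used in this study: the variable names, the sample size, the noise levels, and the least-squares weight are all hypothetical. It assumes, as the paragraph above supposes, that N is loaded with the "plus-getting" component and not with depression itself.

```python
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def regression_weight(y, x):
    """Least-squares slope for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n_persons=2000, seed=0):
    """Generate hypothetical scores; every quantity here is an assumption."""
    rng = random.Random(seed)
    # Hypothetical "genuine" depression component.
    genuine = [rng.gauss(0, 1) for _ in range(n_persons)]
    # Hypothetical plus-getting tendency, independent of genuine depression.
    plus_getting = [rng.gauss(0, 1) for _ in range(n_persons)]
    # Observed D is inflated by plus-getting; N reflects plus-getting only.
    d_score = [g + p + rng.gauss(0, 0.5) for g, p in zip(genuine, plus_getting)]
    n_score = [p + rng.gauss(0, 0.5) for p in plus_getting]
    return genuine, d_score, n_score


genuine, d_score, n_score = simulate()

# Subtract the part of D that is predictable from N.
weight = regression_weight(d_score, n_score)
d_corrected = [d - weight * n for d, n in zip(d_score, n_score)]

r_raw = pearson(d_score, genuine)
r_corrected = pearson(d_corrected, genuine)
print(f"D vs genuine component, uncorrected: {r_raw:.2f}")
print(f"D vs genuine component, corrected:   {r_corrected:.2f}")
```

In this artificial setting the corrected score tracks the genuine component more closely than the raw score does, even though N itself is nearly uncorrelated with that component. Whether real N scores behave in this way is exactly the empirical question raised above.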

Of the various facts collected so far 
regarding the N-scale, the only one 
which does not actively support the 
present interpretation is the deviation of 
college groups. At least the present
hypothesis has the merit that, unlike the
others, it does not come into any real 
conflict with this finding, although it can 
certainly not be said to predict it as a 
logical consequence. Whether college 
merely selects persons of sufficiently high 
verbal intelligence to minimize the prob- 
abilities of such semantic distortions as 
have been suggested remains to be veri- 
fied or refuted by further experimenta- 
tion. But that college educated persons 
should have less of a tendency than 
people generally to interpret personality 
test items in such a way as to put them- 
selves in a “bad” light psychiatrically— 
possibly also as a result of greater verbal 
test sophistication—is at least a good a 
priori possibility compared with the im- 
plications of the alternative hypotheses 
regarding N. 

In conclusion it must be admitted that 
the psychological nature of the chief fac- 
tors contributing to the N score remains 
essentially undetermined on the present 
evidence. The hypothesis tentatively sup- 
ported here leads to interesting and im- 
portant possibilities for investigation of 
the whole problem of improving ques- 
tion-answer type personality inventories. 
As was suggested in Chapter II, it is not 
inconceivable that the future develop- 
ment of this type of personality measur- 
ing device will be aided in large part 
by the systematic analysis of personality- 
test-taking as a behavioral process. Such 
study will certainly increase our ability 
to construct more valid personality tests 
over what would be possible with the 
more traditional approach, which con- 
ceives of such tests as simple self-rating 
scales in which “error” is accepted as 
inevitable because of the basic untrust- 
worthiness of human ratings and particu- 
larly those ratings demanding “objec- 
tive” judgments concerning the self. 








CHAPTER VII 


SUMMARY AND CONCLUSIONS 


THE ORIGINAL aim of the present investigation
was the isolation of a
scale of items on the Minnesota Multiphasic
Personality Inventory which
would quantify the hypothetical factor of
general “normality” or “control” as described
by Rosanoff and purportedly
measured previously in the Humm-Wadsworth
Temperament Scale. The procedure
was one of empirically isolating
items from the total item pool of the
MMPI on the basis of their discrimination
between a criterion group of apparently
normal persons showing marked
deviations on the three scales of the
“neurotic triad,” and a matched group
of clinically abnormal persons with similar
profiles. The scale of 78 items resulting
from this analysis was called N for
“normal” although this term here indicates
its differentiating power only and
does not imply any particular dynamic
interpretation of its operation.

The N scale was found to have a fairly
satisfactory odd-even and test-retest reliability,
and was shown to continue discrimination
when applied to a new
“test” group of deviant-scoring normals.
There was also evidence of some degree
of generality in that it showed statistically
significant differentiation between “test”
normals and abnormals deviating on
other components than the neurotic triad
on which the derivation was based. This
latter discrimination, however, was not
nearly so good in terms of overlap and
could hardly be of any practical utility.

The overall finding from the study of
the neurotic triad seemed to be that the
typical deviant-scoring normal scores in
the neighborhood of one standard deviation
(raw score) above the mean of the
general population, and that the lower
such a raw score becomes, the greater
are the probabilities that the person is
abnormal clinically. If a person shows a
deviant profile and his N-score is as small
as one standard deviation below the normal
population mean, he is practically
certain to be psychiatrically involved to
the extent of actually being under psychiatric
care. Exceptions to this rule are
almost completely confined to persons
who are in college or have been graduated
from college; this group deviates on
the average a full standard deviation
from the mean of the general population.
It was suggested that further investigation
may warrant the construction
of special norms for the college group.

It was shown that unselected abnormals
tend to score about the same as the
general population, with a slight tendency
in the case of female abnormals to
show lower raw scores than normals. In
the case of abnormals, there was evidence
that N functions as an “inverted lie”
scale to some extent, in that hospitalized
abnormals of heterogeneous diagnosis
but with normal-appearing MMPI profiles
tended to show low raw scores on N.

A minor experiment in “blind” diagnosis
indicated that the use of N by an
unskilled person in a purely mechanical
way would enable him to differentiate
the profiles of normal from those of abnormal
persons as well as or better than
could be done by the judgment of persons
highly familiar with MMPI and
skilled in the interpretation of profiles.

It was discovered that N was correlated
in varying degrees with the other scales
of MMPI, in almost all cases significantly
positive. The highest correlations were
with the scales for psychasthenia and 
schizophrenia among the _ personality 
components proper, and with a discarded 
correction scale for the hypochondriasis 
component. Negative correlations were 
found with scales indicating a tendency 
to put oneself in a favorable light, and 
with a sub-set of items from the hysteria 
scale measuring a tendency to deny that 
one has social or psychiatric maladjust- 
ments. There was also found a positive 
correlation of N with chronological age. 
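
The purely mechanical use of N summarized in this chapter can be written out as a small rule. The sketch below is illustrative only: the function names, the returned labels, and the norm values (mean 30, standard deviation 8) are hypothetical placeholders and are not figures from this study. The cutoffs of one standard deviation above and below the general-population mean come from the summary itself. The T-score conversion assumes the conventional linear form, 50 plus 10 times the standard score, which this section does not spell out.

```python
def z_score(raw, norm_mean, norm_sd):
    """Standard score of a raw N score against general-population norms."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw - norm_mean) / norm_sd


def t_score(raw, norm_mean, norm_sd):
    """Conventional linear T-score (assumed form: 50 + 10 * z)."""
    return 50.0 + 10.0 * z_score(raw, norm_mean, norm_sd)


def screen(profile_deviant, n_raw, norm_mean, norm_sd):
    """Mechanical reading of N for a single MMPI record.

    The rule is only stated for deviant profiles, so non-deviant
    profiles are returned as not covered.
    """
    if not profile_deviant:
        return "rule not applicable"
    z = z_score(n_raw, norm_mean, norm_sd)
    if z <= -1.0:
        # Low N with a deviant profile: genuine deviation is likely.
        return "likely clinically abnormal"
    if z >= 1.0:
        # High N is typical of the deviant-scoring normals.
        return "resembles deviant-scoring normals"
    return "indeterminate"


# Hypothetical norms, chosen only to make the arithmetic easy to follow.
NORM_MEAN, NORM_SD = 30.0, 8.0

for raw in (22, 30, 38):
    label = screen(True, raw, NORM_MEAN, NORM_SD)
    print(raw, t_score(raw, NORM_MEAN, NORM_SD), label)
```

Since the college group deviates about a full standard deviation from the general mean, a rule of this kind would need separate norms before it could be applied to college-educated persons.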

Study of the items themselves shows 
that they are in the very great majority 
of instances answered by the “normals” 
who have abnormal appearing profiles 
in the direction which is statistically ab- 
normal in the unselected general popula- 
tion sample, and which furthermore is 
the direction that ordinary judgment 
would consider the unhealthy, unhappy, 
or psychiatrically undesirable direction. 
Those items on N which are common to 
both it and the “normal” component of 
the Humm-Wadsworth are scored in the 
opposite direction in every case on the 
two keys. This paradoxical finding combined
with differential and correlational
data leads to a definite rejection of any 
interpretation of N in terms of Rosan- 
off’s active “normalizing” component as 
originally described. 

Hypotheses as to the psychological 
nature of N are considered briefly. The 
interpretation of N as being simply a 
denial of the most severe and pathog- 
nomonic symptoms is immediately re- 
jected because of its failure to distinguish 
normals from abnormals generally to 
any great extent, combined with the 
actual content of the items. An interpretation
of N in terms of the greater “insight”
shown by the deviant-scoring normals
in admitting to the failings scored
on N is admitted as more plausible but
finally rejected because of the college
findings, the correlation with other 
scales, the lack of good differentiation 
between normals and abnormals in gen- 
eral, and the actual content of the most 
potent items. 

The provisional hypothesis offered is 
that the N scale is loaded with a “plus-getting”
tendency as regards personality
tests, possibly allied to a kind of spurious 
psychasthenia shown by certain persons 
who often achieve extremely pathologi- 
cal scores on such tests as MMPI with a 
minimum of genuine suffering and com- 
plaint. The relative contribution of per- 
sonality factors versus non-personological 
verbal and semantic factors is unknown 
at present. It is suggested that the mark- 
edly deviant neurotic scores shown by 
those normals who also have high raw 
scores on N reflect a distortion of re- 
sponse attributable to a component the 
influence of which must be artificially 
“suppressed” by the use of scales such as 
N in order to reduce test misses. In other 
words, abnormal profiles may arise from 
a genuine abnormality as indicated by 
the scale; but they may also arise from 
“deviations” which, while possibly re- 
lated to various aspects of personality, 
are not the deviations for which the 
scale was built. To correct or partial out 
the influence of these other components 
requires that we first obtain measures of 
them, and the present N-scale is tenta- 
tively offered as an example of such 
measurement. It is suggested that a direct
attack upon this hypothesis concerning 
N might be made via the use of multiple- 
choice situations applied to those persons 
who do deviate in regard to their N- 
scores, in order to find out whether the 
semantic-verbal distortions hypothesized 
really occur. 

Finally even though the differentiating
powers of the present N-scale are not
nearly so great as could be desired for
practical reasons, the mere fact that significant
differentiation exists with the
items being of such a nature and scored
in such a direction as they are, indicates
that a pursuit of this lead in the field of
personality testing may be well rewarded.
Once we have progressed to the point of
sophistication which abandons the questionnaire
as a simple self-rating situation
and infers the significance of responses
from the groups differentiated rather
than the converse, new and important
advances may be expected with confidence.

BIBLIOGRAPHY

1. ALLPORT, G. W. Personality. New York: Henry Holt, 1937.

2. ARNOLD, D. C. The clinical validity of the Humm-Wadsworth
Temperament Scale in psychiatric diagnosis. Unpublished Ph. D.
thesis. Univ. Minn., 1942.

3. BENTON, A. L. The interpretation of questionnaire items in a
personality schedule. Arch. Psychol., N.Y., 1935, No. 190.

4. BROGDEN, H. E. A factor analysis of forty character tests.
Psychol. Monogr., 1940, 52, 39-55.

5. DORCUS, R. M. A brief study of the Humm-Wadsworth Temperament
Scale and the Guilford-Martin Personnel Inventory in an
industrial situation. J. appl. Psychol., 1944, 28, 302-307.

6. DYSINGER, D. W. A critique of the Humm-Wadsworth Temperament
Scale. J. abnorm. soc. Psychol., 1939, 34, 73-83.

7. FISHER, R. A. Statistical methods for research workers.
London: Oliver and Boyd, 1941.

8. FREEMAN, F. L. Toward a psychiatric plimsoll mark:
physiological recovery quotients in experimentally induced
frustration. J. Psychol., 1939, 8, 247-252.

9. HATHAWAY, S. R., and McKINLEY, J. C. A multiphasic personality
schedule: I. Construction of the schedule. J. Psychol., 1940,
10, 249-254.

10. McKINLEY, J. C., and HATHAWAY, S. R. A multiphasic personality
schedule: II. A differential study of hypochondriasis.
J. Psychol., 1940, 10, 255-268.

11. HATHAWAY, S. R., and McKINLEY, J. C. A multiphasic personality
schedule: III. The measurement of symptomatic depression.
J. Psychol., 1942, 14, 73-74.

12. McKINLEY, J. C., and HATHAWAY, S. R. A multiphasic personality
schedule: IV. Psychasthenia. J. appl. Psychol., 1942, 26,
614-624.

13. HATHAWAY, S. R., and McKINLEY, J. C. Manual for the Minnesota
multiphasic personality inventory. Minneapolis: Univ. of
Minnesota Press, 1943.

14. McKINLEY, J. C., and HATHAWAY, S. R. The Minnesota Multiphasic
Personality Inventory: V. Hysteria, hypomania, and psychopathic
deviate. J. appl. Psychol., 1944, 28, 153-174.

15. HOWELLS, T. R. An experimental study of persistence.
J. abnorm. soc. Psychol., 1933, 28, 14-28.

16. HORST, PAUL. The prediction of personal adjustment. New York:
Social Science Research Council, Bulletin No. 48, 1941.

17. HUMM, D. G., and WADSWORTH, G. W. The Humm-Wadsworth
Temperament Scale. Amer. J. Psychiat., 1935, 92, 163-200.

18. HUMM, D. G. Discussion of “A statistical analysis of the
Humm-Wadsworth Temperament Scale.” J. appl. Psychol., 1939, 23,
525-526.

19. HUMM, D. G. Dysinger’s critique of the Humm-Wadsworth
Temperament Scale. J. abnorm. soc. Psychol., 1939, 34, 402-403.

20. HUMM, D. G., STORMENT, R. C., and IORNS, M. E. Combination
scores for the Humm-Wadsworth Temperament Scale. J. Psychol.,
1939, 7, 227-253.

21. HUMM, D. G. Personality and adjustment. J. Psychol., 1942,
13, 109-134.

22. HUMM, D. G. Discussion of Dorcus’ study of the Humm-Wadsworth
Temperament Scale. J. appl. Psychol., 1944, 28, 527-529.

23. KRUGER, BARBARA H. A statistical analysis of the
Humm-Wadsworth Temperament Scale. J. appl. Psychol., 1938, 22,
641-652.

24. LEVERENZ, MAJOR C. W. Minnesota Multiphasic Personality
Inventory: An evaluation of its usefulness in the psychiatric
service of a station hospital. War Med., 1943, 4, 618-629.

25. MALLER, J. B. General and specific factors in character.
J. soc. Psychol., 1934, 5, 97-102.

26. MALLER, J. B. Personality tests. In J. McV. Hunt, (Ed.),
Personality and the behavior disorders. New York: Ronald Press,
1944. Pp. 170-213.

27. McQUITTY, L. L. An approach to the nature and measurement of
personality integration. J. soc. Psychol., 1941, 13, 3-14.

28. MISES, R. VON. Probability, statistics, and truth. New York:
Macmillan, 1939.

29. REICHENBACH, H. Experience and prediction. Chicago: Univ. of
Chicago Press, 1938.

30. ROSANOFF, A. J. Manual of psychiatry (7th Ed.). New York:
John Wiley, 1938.

31. ROSENZWEIG, SAUL. Frustration as an experimental problem:
VI. A general outline of frustration. Character and Pers., 1938,
7, 151-160.

32. ROSENZWEIG, SAUL. A dynamic interpretation of psychotherapy
oriented toward research. In S. Tomkins, (Ed.), Contemporary
psychopathology. Cambridge: Harvard Univ. Press, 1943.

33. ROSENZWEIG, S. An outline of frustration theory. In J. McV.
Hunt, (Ed.), Personality and the behavior disorders. New York:
Ronald Press, 1944.

34. ZUBIN, J. Nomographs for determining the significance of the
differences between the frequencies of events in two contrasted
groups. J. Amer. statist. Ass., 1939, 34, 539-545.





